Network Intelligence System
Unified architecture for connection paths (007), relationship strength (008), and key connection, built on shared Sidekiq + Postgres caching infrastructure with Findem enrichment.
Drafted 2026-04-22 · Updated 2026-04-28 · Status: review · Audience: backend + admin-portal engineers
1. Overview
Ingest email and calendar metadata from each admin user's Google Workspace or Microsoft 365 account and turn it into three transparent relationship indicators on contacts:
- Connection strength, per team-member × contact: Warm / Known / Cold
- Contact reachability, per team × contact: High / Medium / Low / None
- Key connection: the best person on the team to ask for an intro
Anchor decisions
- Keep Contact and ContactEmail structure unchanged. Collection scoping stays.
- Use Findem enrichment as the dedup oracle (email → canonical identity → Getro Contact).
- Two-layer scoring: admin rules assign the tier, weighted score sorts within tier.
- Never create contacts from email metadata. Unattributed interactions stage and resolve lazily.
- Ship a thin V1 (email + calendar, no Findem, no user enrichment) first; upgrade in place.
2. System architecture
Five pipeline stages (ingestion, resolution, rollup, enrichment, scoring), each isolated, each restartable independently. No stage writes outside its own tables.
[Diagram: pipeline flowchart.]
- Providers: Gmail (gmail.metadata), Google Calendar (calendar.readonly), Outlook Mail (Mail.ReadBasic), Outlook Calendar (Calendars.ReadBasic), Findem.
- Ingestion (per user): GmailSyncer, GoogleCalSyncer, OutlookMailSyncer, and OutlookCalSyncer each write InteractionEvent (user · kind · direction · occurred_at · thread_id · contact_email · contact_id?).
- Resolution: FindemEnrichmentResolver reads InteractionEvent, consults the EnrichedEmailLookup cache and Findem, and writes contact_id back to InteractionEvent.
- Rollup: InteractionEvent → NightlyRollupWorker → UserContactInteractionStats.
- User enrichment: Findem → Users::Enrichment::FindemSyncer → UserEnrichedProfile / UserWorkExperience / UserEducation → contact_connections (kind=work_overlap, kind=education_overlap).
- Scoring: UserContactInteractionStats and the overlap edges feed RelationshipStrengthService → UCC (strength_tier, strength_score, strength_override) → ReachabilityService → ContactReachability (tier · counts · key_user_id).
InteractionEvent is provider-agnostic by construction: Gmail, Outlook, calendars all land in the same shape. The scoring engine has no idea where an interaction came from. Adding Slack or Zoom later would be an enum value, not a new pipeline. See DR-01.
Stage responsibilities
| Stage | Input | Output | Trigger |
|---|---|---|---|
| Ingestion | Provider APIs (metadata only) | InteractionEvent rows | Daily cron + on-demand backfill |
| Resolution | New/unresolved InteractionEvent + Contact creates | contact_id populated | after_commit on insert + Contact hooks |
| Rollup | InteractionEvent | UserContactInteractionStats | Nightly full + incremental on event insert |
| User enrichment | User email / linkedin handle | UserEnrichedProfile, work/edu records | On account connect + weekly refresh + Findem webhook |
| Scoring | Stats + overlaps | UCC.strength_tier, ContactReachability | On stats change + on override edit + nightly |
3. Data model
Six new tables. No changes to Contact, ContactEmail, or Collection. Three existing tables get additive columns. CONTACT_CONNECTION (the unified edge table, see §3.1) is the only one shipped today; the others are planned across phases 1–5.
New columns on existing UCC table
| Column | Type | Notes |
|---|---|---|
| strength_tier | enum (warm, known, cold, nullable) | Null until first computation |
| strength_score | float, nullable | Within-tier sort, hidden from users in V1 |
| strength_override | enum, nullable | User-set override; rule result still stored for audit |
| strength_computed_at | timestamp | Last rule-engine run |
Enum values (stored as strings / Rails enums)
- INTERACTION_EVENT.provider: gmail, outlook, gcal, ocal
- INTERACTION_EVENT.kind: email, meeting
- INTERACTION_EVENT.direction: inbound, outbound, attended
- UCC.strength_tier / strength_override: warm, known, cold
- CONTACT_REACHABILITY.tier: high, medium, low, none
Fields excluded from the diagram for brevity
- USER_CONTACT_STATS also has email_inbound_count_24mo, email_outbound_count_24mo, email_last_inbound_at.
- USER_ENRICHED_PROFILE.raw is actually jsonb; INTERACTION_EVENT.raw_headers is jsonb.
- INTERACTION_EVENT.contact_id is nullable (intentional: unresolved interactions).
strength_override is nullable so the absence of a user decision is explicit: we can always tell whether tier was computed or set manually. See DR-06.
Why these shapes
- InteractionEvent is the single normalized store. Gmail, Outlook, calendars all land here. The contact_id being nullable is deliberate: unresolved interactions wait for resolution rather than forcing a premature contact creation.
- UserContactInteractionStats is a materialized cache, not the source of truth. Nightly rebuild + incremental updates on event insert. email_activity_bitmap_12q is a 12-bit int encoding 2-way activity per quarter: cheap to maintain, powerful for sustained-relationship detection.
- UCC gets four columns, not a new table. Strength lives where the team-member-to-contact link already lives. Override is a nullable enum so the absence of an override is explicit.
- ContactReachability is denormalized per (contact, collection). A cheap row-level recompute beats a per-request aggregation over hundreds of UCCs.
- UserEnrichedProfile intentionally mirrors ContactEnrichedProfile: same enrichment engine (Findem), same shape, same webhook wiring.
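To make the email_activity_bitmap_12q idea concrete, here is a minimal Ruby sketch of how a 12-bit per-quarter bitmap could be maintained and queried. The bit layout (bit 0 is the current quarter) and the ActivityBitmap helper names are assumptions for illustration, not the shipped implementation.

```ruby
# Hypothetical helpers for a 12-quarter activity bitmap.
# Assumed layout: bit 0 is the current quarter, bit 11 is the oldest.
module ActivityBitmap
  QUARTERS = 12
  MASK = (1 << QUARTERS) - 1 # twelve low bits set

  # Mark the current quarter as having 2-way activity.
  def self.mark_current(bitmap)
    (bitmap | 1) & MASK
  end

  # Advance one quarter: shift history left and drop the oldest bit.
  def self.roll_quarter(bitmap)
    (bitmap << 1) & MASK
  end

  # Longest run of consecutive active quarters anywhere in the window.
  def self.longest_active_run(bitmap)
    best = 0
    run = 0
    QUARTERS.times do |i|
      run = bitmap[i] == 1 ? run + 1 : 0
      best = run if run > best
    end
    best
  end
end

# Nine quarters in a row with 2-way activity.
bitmap = 0
9.times do
  bitmap = ActivityBitmap.roll_quarter(bitmap)
  bitmap = ActivityBitmap.mark_current(bitmap)
end
puts ActivityBitmap.longest_active_run(bitmap)
```

Under these assumptions, a sustained-relationship rule could compare the longest run against a threshold such as 8 quarters without touching raw InteractionEvent rows.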
3.1 ContactConnection β the unified edge table
Every connection between two contacts (work overlap, education overlap, investor relationship, email exchange, calendar meeting, LinkedIn 1st-degree) is one row in ContactConnection, distinguished by a kind enum. This single table replaces what an earlier draft of this spec called separate tables (UserContactInteractionStats, WorkOverlapCache, EducationOverlapCache, the per-type edge tables). It serves both consumers: 007 graph traversal AND 008 strength rules.
Why one table, not many
The data semantics are identical. UserContactInteractionStats(user_id, contact_id) is a "team user → contact" relationship, and a team user is itself a contact (USERS ||--o| CONTACTS). So every relationship is contact × contact, varying only by what kind of signal connects them. Adding a new signal type (e.g., shared_event, introduced_by) becomes a new enum value, not a new table + worker + migration.
Schema
CREATE TABLE contact_connections (
  id bigserial PRIMARY KEY,
  contact_a_id int NOT NULL,
  contact_b_id int NOT NULL,
  -- "enum" is not a column type on its own; text is assumed here (string-backed Rails enum).
  kind text NOT NULL,
  -- work_overlap | education_overlap |
  -- investment_overlap | board_peer |
  -- email_exchange | meeting | linkedin_connection
  -- For time-bound bridge edges (work, edu, investment, board)
  bridge_entity_type text NULL, -- organization | school | company
  bridge_entity_id int NULL,
  date_from date NULL,
  date_to date NULL,
  -- For interaction edges (email, meeting, linkedin)
  last_signal_at timestamptz NULL,
  -- Universal
  strength_score numeric NOT NULL,
  metadata jsonb NOT NULL DEFAULT '{}',
  computed_at timestamptz NOT NULL,
  CHECK (contact_a_id <> contact_b_id)
);
-- One edge per (pair, kind, bridge). This must be a unique index, not a table-level
-- UNIQUE constraint, because constraints cannot contain expressions such as COALESCE.
-- The index name is illustrative.
CREATE UNIQUE INDEX idx_cc_unique_edge ON contact_connections (contact_a_id, contact_b_id, kind, COALESCE(bridge_entity_id, 0));
CREATE INDEX idx_cc_a_kind ON contact_connections (contact_a_id, kind);
CREATE INDEX idx_cc_b_kind ON contact_connections (contact_b_id, kind);
CREATE INDEX idx_cc_recent ON contact_connections (last_signal_at DESC NULLS LAST);
CREATE INDEX idx_cc_bridge ON contact_connections (bridge_entity_type, bridge_entity_id) WHERE bridge_entity_id IS NOT NULL;
The CHECK (contact_a_id <> contact_b_id) prevents self-edges. The composite uniqueness ensures one edge per (pair, kind, bridge): two contacts can have multiple work_overlap edges if they shared multiple past employers, but only one row per shared employer.
Kind enum and what each row carries
| kind | bridge entity | Date columns used | Typical metadata | Phase |
|---|---|---|---|---|
| work_overlap | organization | date_from, date_to (overlap window) | title_a, title_b, department_overlap, small_org_at_overlap | v1 |
| education_overlap | school | date_from, date_to | degree_a, degree_b, major_overlap | v2 |
| investment_overlap | company | n/a | round_a, round_b, role_a, role_b (investor / advisor / etc.) | v3 |
| board_peer | company | date_from, date_to (tenure overlap) | role_a, role_b | v3 |
| email_exchange | n/a | last_signal_at (most recent email) | inbound_count_12mo, outbound_count_12mo, inbound_nnl_count_24mo, activity_bitmap_12q, last_outbound_at, has_response_in_3mo, newsletter_only | v1 |
| meeting | n/a | last_signal_at (most recent meeting) | count_180d, count_24mo | v1 |
| linkedin_connection | n/a | last_signal_at (when discovered/refreshed) | source ('user_import' / 'findem_enrichment'), uploader_id | v2 (gated on Findem F8 if exposed) |
How the strength engine reads it
def evaluate_strength(contact_a_id, contact_b_id)
  # NOTE: index_by keeps one row per kind. A pair with several work_overlap rows
  # (multiple shared employers) would need group_by to see all of them.
  rows = ContactConnection
    .where(contact_a_id: contact_a_id, contact_b_id: contact_b_id)
    .index_by(&:kind)
  return :warm if w1_match?(rows['email_exchange']) # 5+ each direction in 12mo
  return :warm if w2_match?(rows['email_exchange']) # sustained 2yr bitmap
  return :warm if w3_match?(rows)                   # multi-kind: email + work/linkedin
  return :warm if w4_match?(rows['meeting'])        # meeting in 180d
  return :warm if w5_match?(rows['work_overlap'])   # small-org coemployment
  # ... K1-K6, C1-C5
  :cold
end

def w1_match?(email_row)
  return false unless email_row
  md = email_row.metadata
  md['inbound_count_12mo'].to_i >= 5 &&
    md['outbound_count_12mo'].to_i >= 5
end
Each rule explicitly names which kinds it depends on. Read pattern is one query per pair; downstream logic is a hash-keyed lookup. No cross-table UNION ALL, no JOIN gymnastics.
Per-list rollup tables (still separate: different consumer)
Two rollup tables remain separate from ContactConnection because they're aggregations, not edges:
| Table | Key | Purpose | Phase |
|---|---|---|---|
| SharedListNetworkSummary | (shared_list_id, organization_id, viewer_scope_digest) | Per-org-per-list rollup for 007's hot list-view query: direct_count, intro_count_*, strength_score, top_intro_preview (jsonb). Built by aggregating ContactConnection rows. | v1 |
| ContactReachability | (contact_id, collection_id) | 008's reachability rollup: counts of Warm/Known/Cold owners per (contact, collection). Built by aggregating UCC strength tiers. | v1 |
These tables exist because the page-load read budget (50ms p99) requires single indexed lookups. Re-running the rule engine on every list-view render is too expensive; the rollup is built at write time.
An earlier draft of this spec called for separate tables: UserContactInteractionStats, WorkOverlapCache, EducationOverlapCache, InvestorOverlapCache, plus per-type edge tables. All of that is now ContactConnection rows distinguished by kind. The rollup tables (SharedListNetworkSummary, ContactReachability) stay separate because they're aggregations, not edges. See ADR-007-A and ADR-007-B for the decision rationale.
For the storage substrate decision (Postgres tables vs graph DB vs extension), see the Graph DB vs Postgres ADR companion doc. The summary: at depth-2 with aggregation-heavy read patterns, well-indexed Postgres tables outperform any graph DB option.
4. Integration status
Google Workspace (mostly live)
| Component | Status | Notes |
|---|---|---|
| OAuth client, GoogleAccount, token encryption | Live | lib/google/oauth_client.rb |
| Scopes gmail.metadata + calendar.readonly | Declared | Not yet ingested; already in GoogleAccount::GOOGLE_SCOPES |
| People API contact sync, daily cron | Live | Shipped Dec 2025 |
| Admin-portal Google card, OAuth flow, polling | Live | useIntegrationsPage hook |
| Per-scope UI toggles | Build | Extend GoogleIntegrationCard |
| Gmail + Calendar metadata clients | Build | New files under lib/google/ |
| Syncers, schedulers, workers | Build | Mirror existing ContactsSyncer pattern |
| Newsletter / bulk inbound filter | Build | Header inspection + recipient count |
| CircuitBox wrap retrofit | Build | Gap in existing integration |
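The guard-layer idea behind the CircuitBox retrofit can be illustrated with a small, stdlib-only circuit breaker. This is a sketch under stated assumptions: TinyCircuit, its thresholds, and the CIRCUITS registry are hypothetical stand-ins written for this example. It is not the Circuitbox gem and does not mirror its API or configuration options.

```ruby
# TinyCircuit is a hypothetical, stdlib-only stand-in used to illustrate the
# guard-layer idea. It is NOT the Circuitbox gem.
class TinyCircuit
  class OpenError < StandardError; end

  def initialize(name, failure_threshold: 3, cool_off_seconds: 30, clock: -> { Time.now })
    @name = name
    @failure_threshold = failure_threshold
    @cool_off_seconds = cool_off_seconds
    @clock = clock
    @failures = 0
    @opened_at = nil
  end

  # Runs the block unless the circuit is open; business logic stays in the block.
  def run
    raise OpenError, "#{@name} circuit is open" if open?
    result = yield
    @failures = 0
    @opened_at = nil
    result
  rescue OpenError
    raise # an open circuit is not counted as a new failure
  rescue StandardError
    @failures += 1
    @opened_at = @clock.call if @failures >= @failure_threshold
    raise
  end

  # Open means: tripped, and still inside the cool-off window.
  def open?
    return false unless @opened_at
    (@clock.call - @opened_at) < @cool_off_seconds
  end
end

# One named circuit per provider, so a Gmail outage does not trip Calendar.
CIRCUITS = Hash.new { |hash, name| hash[name] = TinyCircuit.new(name) }

3.times do
  begin
    CIRCUITS[:google_gmail].run { raise IOError, "simulated outage" }
  rescue IOError
    # swallowed for the demo
  end
end

puts CIRCUITS[:google_gmail].open?    # the Gmail circuit has tripped
puts CIRCUITS[:google_calendar].open? # the Calendar circuit is untouched
```

The point of the per-provider registry is isolation: each named circuit keeps its own failure count, so repeated failures in one client leave the others callable.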
The CircuitBox retrofit wraps each provider call in Circuitbox.circuit(:name, ...).run { … }: no changes to business logic, just a guard layer. One named circuit per provider (:google_people, :google_gmail, :google_calendar, :ms_graph_mail, :ms_graph_calendar, :findem_profile) so one provider's outage doesn't cascade to others.
Microsoft 365 (greenfield)
Multi-tenant app registration: signInAudience: "AzureADMultipleOrgs".
| Component | Status |
|---|---|
| Azure AD app registration (multi-tenant) | Build + ops |
| MicrosoftAccount model + migration | Build |
| lib/microsoft/graph_client.rb base | Build |
| Mail + calendar metadata clients | Build |
| OAuth callback controller + refresh flow | Build |
| Syncers, schedulers, workers for both | Build |
| Admin-portal Microsoft card + RTK Query service | Build |
| Env vars, feature flag, egress registry entry | Build + ops |
Findem (pending capabilities)
| Capability | Consumer | Status |
|---|---|---|
| F1 · Person lookup by email | Dedup resolver | Ask Findem |
| F2 · Enrichment by LinkedIn handle | Contact enrichment | Live |
| F3 · Enrichment when only email known | Fallback contact enrichment | Ask Findem |
| F4 · User enrichment | New UserEnrichedProfile pipeline | Ask Findem |
| F5 · Company details (size, stage) | Work-overlap gate | Partial |
| F6 · Investor / cap-table data | Investor-overlap signal (future) | Ask Findem |
| F7 · Webhooks on profile updates | Re-trigger rollups | Framework |
5. Service layer
Rule engine + scorer
The scorer is stateless: given a User and a Contact, it reads the rollup + overlap caches and returns a deterministic result.
Adding a new signal means adding a Rule row, with no changes to the evaluator. This makes the V2 admin-tunable-rules work a straightforward extension instead of a refactor. See DR-04.
Result = Struct.new(:tier, :score, :reasons)
# tier: :warm | :known | :cold
# score: Float, for within-tier sort only
# reasons: [{ code: :twoway_12mo_5plus, met: true, detail: "7 in / 9 out" }, ...]
class RelationshipStrengthService
  def self.call(user:, contact:)
    stats = UserContactInteractionStats.find_by(user: user, contact: contact)
    overlap = ContactConnection.work_overlaps.for_pair(user.contact_id, contact.id)
    edu = ContactConnection.kind_education_overlap.for_pair(user.contact_id, contact.id)
    evaluated = RULES.map { |rule| rule.evaluate(stats, overlap, edu) }
    # The highest tier with at least one met rule wins; default to :cold.
    tier = %i[warm known cold].find { |t| evaluated.any? { |r| r.tier == t && r.met } } || :cold
    score = WEIGHTED_SCORE.call(evaluated)
    Result.new(tier, score, evaluated.select(&:met))
  end
end
Rules are declared as data, not code:
RULES = [
  Rule.new(code: :twoway_12mo_5plus, tier: :warm, signal_weight: 3.0,
    predicate: ->(s, _, _) { s && s.email_inbound_count_12mo >= 5 && s.email_outbound_count_12mo >= 5 }),
  Rule.new(code: :sustained_2yr, tier: :warm, signal_weight: 2.5,
    predicate: ->(s, _, _) { s && s.email_active_quarters_consecutive >= 8 }),
  # ... 14 more ...
]
Adding a new signal = adding a Rule row. No changes to the evaluator.
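As a rough illustration of "rules as data", the sketch below shows one way a Rule row and its evaluation result could be shaped so that the evaluator never changes when a rule is added. Everything here is an assumption for the example: the Evaluation struct, the Stats struct, the :any_outbound_12mo rule, and the demo_tier helper are hypothetical and not part of the spec.

```ruby
# Sketch only: Evaluation, Stats, DEMO_RULES, and demo_tier are illustrative.
Evaluation = Struct.new(:code, :tier, :weight, :met, keyword_init: true)

Rule = Struct.new(:code, :tier, :signal_weight, :predicate, keyword_init: true) do
  # Runs the predicate and records whether this rule was met.
  def evaluate(stats, overlap, edu)
    Evaluation.new(code: code, tier: tier, weight: signal_weight,
                   met: !!predicate.call(stats, overlap, edu))
  end
end

Stats = Struct.new(:email_inbound_count_12mo, :email_outbound_count_12mo, keyword_init: true)

DEMO_RULES = [
  Rule.new(code: :twoway_12mo_5plus, tier: :warm, signal_weight: 3.0,
           predicate: ->(s, _, _) { s && s.email_inbound_count_12mo >= 5 && s.email_outbound_count_12mo >= 5 }),
  # Hypothetical "known" rule, included only to show the tier fallthrough.
  Rule.new(code: :any_outbound_12mo, tier: :known, signal_weight: 1.0,
           predicate: ->(s, _, _) { s && s.email_outbound_count_12mo >= 1 })
].freeze

# Highest tier with at least one met rule wins; :cold is the default.
def demo_tier(stats)
  evaluated = DEMO_RULES.map { |rule| rule.evaluate(stats, nil, nil) }
  %i[warm known cold].find { |t| evaluated.any? { |e| e.tier == t && e.met } } || :cold
end

puts demo_tier(Stats.new(email_inbound_count_12mo: 7, email_outbound_count_12mo: 9))
puts demo_tier(Stats.new(email_inbound_count_12mo: 0, email_outbound_count_12mo: 2))
puts demo_tier(nil)
```

Because the evaluator only iterates over the rule list, appending another Rule entry changes behavior without touching demo_tier.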
Reachability service
Pure aggregation: runs once per (contact, collection) whenever any UCC tier in that collection changes.
class ReachabilityService
  THRESHOLDS = {
    high: ->(c) { c[:warm] >= 1 || c[:known] >= 3 || c[:cold] >= 10 },
    medium: ->(c) { c[:known] >= 1 || c[:cold] >= 5 },
    low: ->(c) { c[:cold] >= 1 }
  }.freeze

  def self.call(contact:, collection:)
    uccs = UserContactCollection.where(contact: contact, collection: collection)
    # Default every tier to 0 so the predicates never compare against nil,
    # and skip rows whose tier has not been computed yet.
    counts = { warm: 0, known: 0, cold: 0 }.merge(
      uccs.where.not(strength_tier: nil).group(:strength_tier).count.symbolize_keys
    )
    tier = THRESHOLDS.find { |_, pred| pred.call(counts) }&.first || :none
    key = uccs.order(strength_tier_ordinal: :desc, strength_score: :desc).first&.user_id
    ContactReachability.upsert({ contact_id: contact.id, collection_id: collection.id,
                                 tier: tier, key_user_id: key, **counts, computed_at: Time.current })
  end
end
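The threshold logic can be checked in isolation from the database. The sketch below restates the three predicates over a plain counts hash; the reachability_tier helper and the REACHABILITY_THRESHOLDS constant name are hypothetical, introduced only so the example is self-contained.

```ruby
# Illustrative, database-free version of the reachability thresholds.
REACHABILITY_THRESHOLDS = {
  high:   ->(c) { c[:warm] >= 1 || c[:known] >= 3 || c[:cold] >= 10 },
  medium: ->(c) { c[:known] >= 1 || c[:cold] >= 5 },
  low:    ->(c) { c[:cold] >= 1 }
}.freeze

# counts may omit tiers; missing tiers are treated as zero.
def reachability_tier(counts)
  filled = { warm: 0, known: 0, cold: 0 }.merge(counts)
  # Hash order is insertion order, so the first matching (highest) tier wins.
  REACHABILITY_THRESHOLDS.find { |_, pred| pred.call(filled) }&.first || :none
end

puts reachability_tier({ warm: 1 })
puts reachability_tier({ known: 2, cold: 1 })
puts reachability_tier({ cold: 4 })
puts reachability_tier({})
```

With these inputs the tiers come out as high, medium, low, and none respectively: one Warm owner is enough for High, while Cold owners only reach High at ten.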
Rollup worker
Two entry points:
- Incremental: on every InteractionEvent insert via after_commit, bump counters and update the activity bitmap. O(1) per event.
- Nightly full: recompute all rows touched in the last 24h. Catches drift, re-bucketizes aging windows (e.g. emails that just fell outside the 12-month window).
Findem enrichment resolver
Lookup → cache → Contact match. Conservative: ambiguous Findem responses leave contact_id null.
EnrichedEmailLookup cache absorbs ~99% of calls after day 1. Typical user: ~800 Findem calls on backfill, <10/day steady state. See DR-03 and the FAQ.
[Diagram: resolution flow for one InteractionEvent.]
- Start: an event arrives with contact_email = bob@acme.com.
- Normalize the email (lowercase + trim).
- EnrichedEmailLookup cache hit? On a hit, go straight to "Apply to Getro". On a miss, call the Findem F1 lookup.
- Confidence >= threshold? If no, leave contact_id NULL and store a negative cache entry for 7 days. If yes, store the canonical identity (linkedin_handle, known_emails).
- Apply to Getro: look for a Contact by linkedin_handle; if none, run ExistingContactsByEmailsFinder on known_emails. If either finds a Contact, set contact_id on the InteractionEvent; otherwise leave it NULL.
- Done. An unresolved event may resolve later via the Contact create hook.
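The resolution flow (normalize, cache, Findem F1 lookup, confidence gate, match by linkedin_handle and then by known_emails) can be sketched in plain Ruby with injected collaborators. All names here are hypothetical: EmailResolver, the three lambdas, and the 0.8 confidence threshold are assumptions made for the example, since the spec does not state a threshold value or an interface.

```ruby
# Sketch of the resolution flow. EmailResolver and its collaborators are
# hypothetical; a Hash stands in for the EnrichedEmailLookup cache.
class EmailResolver
  CONFIDENCE_THRESHOLD = 0.8          # assumed value; the spec does not state one
  NEGATIVE_TTL_SECONDS = 7 * 86_400   # negative cache entries live for 7 days

  def initialize(findem_lookup:, find_by_handle:, find_by_emails:, cache: {}, clock: -> { Time.now })
    @findem_lookup = findem_lookup
    @find_by_handle = find_by_handle
    @find_by_emails = find_by_emails
    @cache = cache
    @clock = clock
  end

  # Returns a contact id, or nil when the event should stay unresolved.
  def resolve(raw_email)
    email = raw_email.to_s.strip.downcase
    entry = fresh_cache_entry(email) || lookup_and_store(email)
    return nil if entry[:negative]
    @find_by_handle.call(entry[:linkedin_handle]) || @find_by_emails.call(entry[:known_emails])
  end

  private

  # A cached entry counts only if it has no expiry or has not expired yet.
  def fresh_cache_entry(email)
    entry = @cache[email]
    return nil unless entry
    return nil if entry[:expires_at] && entry[:expires_at] <= @clock.call
    entry
  end

  # Cache miss: call the lookup and store either a canonical or a negative entry.
  def lookup_and_store(email)
    response = @findem_lookup.call(email)
    @cache[email] =
      if response && response[:confidence] >= CONFIDENCE_THRESHOLD
        { negative: false, linkedin_handle: response[:linkedin_handle], known_emails: response[:known_emails] }
      else
        { negative: true, expires_at: @clock.call + NEGATIVE_TTL_SECONDS }
      end
  end
end

lookups = []
resolver = EmailResolver.new(
  findem_lookup: lambda do |email|
    lookups << email
    { confidence: 0.95, linkedin_handle: "bob-acme", known_emails: [email] } if email == "bob@acme.com"
  end,
  find_by_handle: ->(handle) { handle == "bob-acme" ? 42 : nil },
  find_by_emails: ->(_emails) { nil }
)

puts resolver.resolve("  Bob@Acme.com ").inspect
puts resolver.resolve("bob@acme.com").inspect # served from the cache
puts lookups.size
```

In this sketch the second call for the same address never reaches the lookup lambda, which is the behavior the cache is meant to provide; low-confidence or missing responses are remembered as negative entries until they expire.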
6. Caching architecture
This section explains how the system's caches are organized, how they're maintained, and why they're shaped the way they are. Two halves: Part A is plain English for product, design, and reviewers; Part B is technical detail for engineers implementing or operating the system.
6.A In plain English
The system has three things to keep track of, all derived from the same raw data: who shares a job history with whom (for intro paths), who's in regular email or calendar contact (for relationship strength), and the rolled-up answers for whole lists at a time. The challenge is that computing any of those questions from scratch takes too long when you ask for fifty or two hundred answers at once. So we pre-compute and store the answers, then update the stored answers whenever the underlying data changes. That's the whole idea.
There are three kinds of stored answers. The first kind, which we call per-pair caches, holds answers shaped like "for this team member and this contact, here's what we know: they've exchanged twelve emails in the last year, they had two meetings in the last six months, they once worked together at Stripe." There's a separate cache for each kind of signal: one for email, one for work overlap, one for education overlap, one for shared investments. Each cache is small, narrow, and maintained independently from the others.
The second kind, which we call edge tables, holds shortcuts shaped like "Mike and Sarah worked at Stripe together from 2020 to 2022." This is what 007 reads when it needs to find intro paths between contacts. It's structurally similar to the per-pair caches but keyed differently: it's about two contacts, not about a team member and a contact.
The third kind, which we call list rollups, are the at-a-glance summaries. For each list a user has, and each company on that list, we store one row that says "you have three direct contacts here, twelve intro paths, strength score 42, most recent signal April 12." This is what the company-list page reads when it loads: one indexed query, no fan-out. Without this rollup, the list page would have to walk through every other cache, every time, and the page would never finish loading.
When something changes in the underlying data (a contact updates their job, a new email comes in, someone joins your shared list), a small Rails hook fires immediately after the change is committed. That hook doesn't do the work itself; it just queues a background Sidekiq job for each cache that might need updating. There's one job per cache because each cache is independent. The work happens out of band, so the user who triggered the change doesn't wait. Each job is also deduplicated, so if ten emails arrive in a second, we only run one rollup recomputation, not ten.
When a page loads, it reads only from the caches and rollups, never from the raw underlying data. The list view hits the rollup table once and gets back fifty rows, one per company, with everything it needs to render. A contact-detail page hits two or three of the per-pair caches. Drilling into a specific company opens up the edge table to enumerate the actual paths. None of these reads ever touch the original work-history or email tables, which are far too slow to be on a hot path.
Two things keep the system honest. First, every job is idempotent and version-checked, so if a worker dies mid-write, it can safely re-run from where it stopped. Second, a nightly reconciliation job sweeps through each cache and double-checks it against the source data, fixing any drift. So if a Sidekiq job genuinely failed and we missed an update, we catch it within twenty-four hours and the cache heals itself.
The reason we use this many small, narrow caches instead of one big universal one is failure isolation. If the work-overlap calculator has a bug, only the work-overlap cache is wrong; the email signal cache keeps working. If we want to rebuild the investor cache from scratch, we don't have to touch anything else. Each heuristic gets its own little pipeline, its own worker, its own reconciliation, and the consumers stitch the answers together at read time. The trade-off is that one underlying change can fan out to several different jobs, but those jobs are all small and quick, so the total cost is fine.
6.B Technical detail
6.B.1 The three layers
Every cache in the system fits into one of three layers, distinguished by key shape and consumer:
| Layer | Key shape | Examples | Read consumer |
|---|---|---|---|
| Layer 2: Per-pair (user × contact) | (team_user, contact) | Future per-user rollup tables (e.g., UserContactInteractionStats), not yet built. For now, computed on read by joining UserContactCollection through contact_connections. | 008 strength rule engine; per-contact strength chips on contact-detail pages |
| Layer 3: Cross-pair edges | (contact_a, contact_b, kind, bridge_entity) | One unified contact_connections table per ADR-007-B. Each row carries a kind enum: work_overlap, education_overlap, investment_overlap, board_peer, email_exchange, meeting, linkedin_connection. Today only work_overlap is populated. | 007 drill-in; intro-path enumeration on contact-detail and company-detail; DeepFinder 3-hop spike API |
| Layer 4: Per-list rollup | (list_id, organization_id) or (contact_id, collection_id) | SharedListNetworkSummary, ContactReachability: not yet built; planned for Phase 5. | 007 list-view; 008 reachability column on collection contact lists |
(Layer 1 is source data: InteractionEvent, ContactWorkExperience, ContactEducation, UserContactCollection. Never read on hot paths.)
The layers are read independently (a contact-detail page hits Layer 2 only, a list view hits Layer 4 only) and written together via fan-out from Layer 1 source events.
6.B.2 Write path: source event → fan-out to caches
Per ADR-007-B, all kinds of edges write to the single contact_connections table. There is one Sidekiq worker class per kind (not per table); each worker is responsible for upserting rows of its kind into the unified table. Today, only WorkOverlapEdgeWorker is implemented; other kinds are placeholders for future phases.
[Diagram: write-path fan-out from source events to caches.]
- ContactWorkExperience change → after_commit → WorkOverlapEdgeWorker (shipped) → contact_connections (kind=work_overlap).
- InteractionEvent insert (planned) → after_commit → EmailExchangeRollupWorker (planned) → contact_connections (kind=email_exchange).
- InteractionEvent insert (planned) → after_commit → MeetingRollupWorker (planned) → contact_connections (kind=meeting).
- ContactEducation change (planned) → after_commit → EducationOverlapEdgeWorker (planned) → contact_connections (kind=education_overlap).
- UserContactCollection insert → after_commit → ListSummaryRollupWorker (planned) → SharedListNetworkSummary (planned).
- Downstream rollups: work_overlap edge writes trigger ListSummaryRollupWorker; email_exchange edge writes trigger the ContactReachability rollup (planned).
Write semantics:
- Trigger: Rails after_commit hook on the source-table model. The hook never blocks the user-facing request; it only enqueues Sidekiq jobs. The ContactWorkExperience hook is shipped today (app/models/contact_work_experience.rb) and fires on create / relevant column updates / destroy.
- One worker class per kind, all writing to the same table. WorkOverlapEdgeWorker upserts contact_connections rows of kind=work_overlap; future EmailExchangeRollupWorker will upsert kind=email_exchange, etc. Workers are independent (different queues, retry policies) but share the substrate. A bug in one kind's writer doesn't corrupt another kind's rows because they're partitioned by the kind column.
- Idempotency: each worker delegates to its kind's backfill service (e.g., WorkOverlapBackfillService) which performs an UPSERT with record_timestamps: true, preserving created_at on conflict and refreshing updated_at + the domain-level computed_at.
- Deduplication: sidekiq-unique-jobs with :until_executed on (worker_class, contact_id). A burst of N CWE writes for one contact during enrichment collapses to one job.
- Cascading rollups: when an edge write completes, it enqueues the downstream Layer-4 rollup (planned). The rollup is also debounced: a burst of edge writes for one (list, org) collapses to one rollup compute.
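The deduplication bullet describes a lock keyed on (worker_class, contact_id) that is held from enqueue until the job has run. The in-memory sketch below illustrates that behavior only. DedupQueue is a hypothetical class written for this example; it is not sidekiq-unique-jobs and makes no claim about that gem's internals.

```ruby
require "set"

# In-memory illustration of "unique until executed" semantics.
# DedupQueue is hypothetical; it is not sidekiq-unique-jobs.
class DedupQueue
  def initialize
    @locks = Set.new
    @jobs = []
  end

  # Returns true if enqueued, false if an identical job is already pending.
  def enqueue(worker_class, contact_id)
    key = [worker_class, contact_id]
    return false unless @locks.add?(key) # add? returns nil when the key exists
    @jobs << key
    true
  end

  # Runs pending jobs in order. Each lock is released once its job has run
  # (in this sketch, also when the job raises).
  def drain
    executed = 0
    until @jobs.empty?
      key = @jobs.shift
      begin
        yield(*key)
        executed += 1
      ensure
        @locks.delete(key)
      end
    end
    executed
  end
end

queue = DedupQueue.new
# A burst of ten writes for contact 7 plus one write for contact 8.
10.times { queue.enqueue("WorkOverlapEdgeWorker", 7) }
queue.enqueue("WorkOverlapEdgeWorker", 8)

ran = queue.drain { |worker, contact_id| puts "#{worker} for contact #{contact_id}" }
puts ran
```

Here eleven enqueue calls collapse to two executed jobs, one per contact, and the same contact can be enqueued again once its job has finished.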
6.B.3 Read path: page-load queries
Each user-facing page hits a specific layer. Not every layer is needed for every page.
| Page / endpoint | Layer hit | Query shape | p99 budget |
|---|---|---|---|
| 007 list view (Network Connections tab) | Layer 4 only | Single SELECT from SharedListNetworkSummary with WHERE/ORDER BY on rollup columns | 50 ms |
| 007 drill-in (one company's paths) | Layer 3 + Layer 4 | SELECT from contact_connections WHERE kind IN (...); bounded by org scope | 200 ms |
| 008 reachability column on collection list | Layer 4 only | SELECT from ContactReachability joined on UCC | 50 ms |
| 008 strength tier on contact-detail card | Layer 2 + UCC additive cols | SELECT from UCC + per-pair caches; small N | 100 ms |
| 008 strength rule re-evaluation (admin tool) | Layer 2 only | Reads contact_connections rows for one (user, contact) pair across kinds; runs rule predicates | 150 ms |
| Existing per-contact connection paths (007 Finder, 2 hops, shipped) | Layer 1 + Layer 3 | CTE over contact_work_experiences + UCC; can short-circuit via contact_connections when populated | 2 s (existing budget; under 200 ms once contact_connections is populated) |
| 3-hop deep paths (007 DeepFinder, spike, shipped) | Layer 1 (UCC seeds) + Layer 3 | Target-rooted backward BFS over contact_connections (kind=work_overlap) joined to UCC.shared seeds; bounded by per-hop edge cap. See ADR-007-A note below. | 250 ms (statement_timeout); empirical: ~30–200 ms typical, ~7 s p99 on hyper-connected targets |
None of the hot-path queries touch Layer 1 directly. Layer 1 is only read by workers (write path) and by reconciliation jobs.
6.B.4 Reconciliation: keeping caches honest
Each cache has a paired reconciler Sidekiq job, scheduled nightly via cron. The reconciler walks the cache's source rows in batches, recomputes the cache row deterministically, and either confirms or corrects the stored value. Drift is logged with a counter; if drift exceeds a threshold for any cache, on-call gets paged.
- Version columns: each cache row carries computed_at and a source_version hash. The reconciler compares the source version against the cache version to detect skew.
- Bounded batch size: reconcilers process N rows per Sidekiq job (default 1,000) and re-enqueue with a cursor for the next batch. Doesn't lock tables.
- Idempotent: running the reconciler twice produces the same result. Safe to trigger manually after a deploy or after a backfill.
- Per-cache scheduling: each reconciler runs on its own cron entry. A long-running work-overlap reconciler doesn't block the email-stats reconciler.
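The bounded-batch, cursor-based sweep can be sketched without a database. In the example below, plain hashes stand in for the source and cache tables, and reconcile_batch and ReconcileResult are hypothetical names chosen for illustration. The sketch only corrects or fills rows that exist in the source; removing orphaned cache rows is left out.

```ruby
# Hypothetical, in-memory reconciler step. `source` maps id => recomputed value
# and `cache` maps id => stored value; both stand in for database tables.
ReconcileResult = Struct.new(:next_cursor, :drift_count, keyword_init: true)

def reconcile_batch(source:, cache:, cursor: 0, batch_size: 1_000)
  # Take the next batch of source ids strictly after the cursor.
  ids = source.keys.select { |id| id > cursor }.sort.first(batch_size)
  drift = 0
  ids.each do |id|
    next if cache[id] == source[id]
    cache[id] = source[id] # correct (or fill in) the stored value
    drift += 1
  end
  # A nil cursor means the sweep is finished; otherwise re-enqueue with it.
  next_cursor = ids.size < batch_size ? nil : ids.last
  ReconcileResult.new(next_cursor: next_cursor, drift_count: drift)
end

source = { 1 => "a", 2 => "b", 3 => "c" }
cache  = { 1 => "a", 2 => "stale" }

cursor = 0
total_drift = 0
while cursor
  result = reconcile_batch(source: source, cache: cache, cursor: cursor, batch_size: 2)
  total_drift += result.drift_count
  cursor = result.next_cursor
end

puts total_drift
puts cache == source
```

Running the loop a second time over the same data reports zero drift, which is the idempotency property described above.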
6.B.5 Failure isolation
The architecture's main reliability property is that one broken heuristic doesn't corrupt the others. This is enforced at three levels:
- Worker level: each kind has its own worker class, its own Sidekiq queue, and its own retry policy. A bug in WorkOverlapEdgeWorker leaves the future EmailExchangeRollupWorker entirely unaffected. Both write to contact_connections but to disjoint subsets of rows partitioned by the kind column.
- Reconciliation level: each kind has its own nightly reconciler scoped to WHERE kind = '...'. A regression in one kind's logic shows up as drift in only that subset; on-call investigates one kind at a time.
- Read level: rollup queries treat missing edges as zeros (or null tier), not as errors. If no rows exist for a (user, contact) pair with kind=investment_overlap, strength rules fall through to the next clause that doesn't depend on it. The user sees a slightly less-rich answer; the page still loads.
6.B.5a DeepFinder: target-rooted backward BFS over contact_connections
The 3-hop spike API (GET /api/v2/collections/:id/contacts/:id/deep_connection_paths) is implemented by Contacts::ConnectionPaths::DeepFinder. Algorithm note worth recording because the naive forward walk does not scale and the choice is not obvious from the table schema:
- Forward walk (rejected): start at every UCC.shared seed in the collection, walk N hops via contact_connections, filter post-hoc to paths terminating at the target. Frontier grows as O(seed_count × branching^N). On seeded data (24k contacts, 27k seeds, ~25 edges/contact), depth=3 produces 5.4M intermediate walk rows of which 99.98% are discarded by the target filter. Wall clock: ~13 s, dominated by recursion fan-out.
- Target-rooted backward walk (shipped): start at the target, walk N-1 hops outward via contact_connections, then INNER JOIN the terminal contacts against UCC.shared seeds. Frontier grows as O(branching^N), independent of seed count. Same workload: ~900 walk rows, ~10 ms, a ~1700× speedup with bit-identical result set.
- Per-hop edge cap: the recursive case uses INNER JOIN LATERAL ... LIMIT MAX_EDGES_PER_HOP=25, ordered by last_signal_at DESC, to bound worst-case fan-out for hyper-connected targets (a contact with 500+ edges would otherwise expand to 125M frontier rows at depth=3).
- Statement timeout: 250 ms via SET LOCAL statement_timeout, returning failure(error: "deep_finder.timeout: ...") rather than blocking the request.
- Implication for ADR-007-A: the original ADR claim that "Postgres recursive CTE handles depth=3 acceptably" is true only with target-rooted BFS. The naive forward-walk approach times out on seeded data. Substrate choice (Postgres vs graph DB) was correct; algorithm choice was load-bearing.
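The target-rooted walk with a per-hop edge cap can be illustrated in memory. This is a sketch under stated assumptions, not the DeepFinder SQL: the backward_paths helper is hypothetical, edges are a plain adjacency hash listing each undirected edge under both endpoints, last_signal_at is a non-nil integer, and a seed may also appear as an intermediate hop.

```ruby
require "set"

# Illustrative in-memory version of a target-rooted backward walk.
# `edges` maps contact_id => array of [neighbor_id, last_signal_at] pairs.
def backward_paths(edges:, seeds:, target:, max_hops: 3, max_edges_per_hop: 25)
  seed_set = seeds.to_set
  frontier = [[target]] # every partial path starts at the target
  found = []

  max_hops.times do
    next_frontier = []
    frontier.each do |path|
      # Per-hop cap: keep only the most recently active edges.
      neighbors = (edges[path.last] || [])
                  .sort_by { |_, last_signal_at| -last_signal_at }
                  .first(max_edges_per_hop)
      neighbors.each do |neighbor_id, _|
        next if path.include?(neighbor_id) # never revisit a contact on the same path
        new_path = path + [neighbor_id]
        next_frontier << new_path
        # Seed filter applied to the terminal contact; report as seed -> ... -> target.
        found << new_path.reverse if seed_set.include?(neighbor_id)
      end
    end
    frontier = next_frontier
  end
  found
end

edges = {
  1 => [[2, 100], [3, 90]],
  2 => [[1, 100], [4, 80]],
  3 => [[1, 90]],
  4 => [[2, 80]]
}

p backward_paths(edges: edges, seeds: [3, 4], target: 1, max_hops: 2)
```

The frontier here depends only on the branching around the target, not on how many seeds exist. The cap trades completeness for bounded work: with max_edges_per_hop set to 1 on this graph, both paths are dropped because only the most recent edge at each hop is followed.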
6.B.6 Trade-offs
| Property | Choice | What we gain | What we give up |
|---|---|---|---|
| Caching at all | Yes, mandatory for hot paths | 50 ms list-view p99 vs hours of compute | Storage (~5β25 MB per collection); freshness lag (seconds to minutes) |
| One worker per kind, one shared edge table | Per-kind workers all writing to contact_connections partitioned by kind enum (ADR-007-B) |
Failure isolation by kind; per-kind reconciliation; per-kind rebuild; one schema migration covers all future signal types | Write amplification (one source change β N worker jobs); single hot table for all kinds (mitigated by kind-specific indexes) |
| Layer 4 rollups | Pre-aggregate the list-view answer at write time | List-view query is one indexed read | Rollup must be re-derived when ranking weights change (column add + backfill) |
| Per-pair vs per-edge keying | Two parallel artifacts for the same kernel | Each consumer reads with its own optimal key shape | Two write paths from one source event (acceptable; both are cheap) |
| Cold-path heuristics | Don't cache; compute live | No storage cost, no freshness lag, no worker | Higher latency on the rare reads (acceptable because they're rare) |
6.B.7 Storage and write-amplification numbers
Concrete sizing for a collection with ~10,000 shared contacts and ~5 active team members. Indexes ~3× row size; total figures include indexes. All "cache" rows below live in the unified contact_connections table partitioned by kind (per ADR-007-B); the Cache column names them by their conceptual role.
| Cache (kind / role) | Rows per collection | Storage | Writes per source event |
|---|---|---|---|
| Per-user × contact rollup (planned, not yet built; likely a separate UserContactInteractionStats table) | ~50,000 (5 users × 10k contacts) | ~30 MB | 1 UPSERT per InteractionEvent insert |
| contact_connections WHERE kind = 'work_overlap' (shipped) | ~5,000–100,000 (one per (contact_a, contact_b, org) triple with overlap; collapsed across multiple stints by the backfill) | ~3–50 MB | 1–N UPSERTs per ContactWorkExperience change, deduped by Sidekiq lock per contact |
| contact_connections WHERE kind IN ('email_exchange', 'meeting', 'education_overlap', ...) (planned) | varies by signal density | ~5–50 MB per kind | 1 UPSERT per source event after dedup |
| SharedListNetworkSummary (planned) | ~50 (one per org on the list) | <1 MB | 1 UPSERT per affected (list, org) pair |
| ContactReachability (planned) | ~10,000 (one per contact in collection) | ~5 MB | 1 UPSERT per UCC strength-tier change |
Total per active collection (when all kinds populated): ~50–150 MB on disk including indexes. Write-amplification: a single ContactWorkExperience change triggers ~1 worker job per touched contact (Sidekiq lock collapses bursts), upserting ~5–20 contact_connections rows in <500 ms.
6.B.8 When to skip caching for a heuristic
Not every heuristic needs a cache. Skip the cache and read live from source tables when all three are true:
- The heuristic is only read on cold paths (admin tools, drill-ins, one-off audits).
- The live query takes <500 ms with proper indexes.
- The query is bounded (single contact, single pair, single org; never a full-collection scan).
Examples that meet all three: per-contact strength rule re-evaluation in admin tools, single-org drill-in for connection paths, debug queries. Building caches for these adds storage and worker complexity without earning meaningful read latency back.
6.B.9 Operational gaps (known issues, not yet addressed)
The work_overlap spike surfaced three gaps. The architecture above describes the correct behavior in principle, but the shipped code does not handle these cases yet. They are documented here so that future kinds do not silently inherit them.
- Ghost edges on destroy. WorkOverlapBackfillService uses upsert_all: it inserts new edges and updates existing ones, but never DELETEs. When the only ContactWorkExperience linking contacts A and B at organization Z is destroyed, the corresponding contact_connections row persists as a stale "edge to nowhere." The ContactWorkExperience#after_commit destroy hook re-runs the backfill scoped to that contact, but the backfill's INSERT path can't observe pairs that no longer have any source rows. Fix path: extend the per-contact backfill to compute the set of (pair, org) tuples currently in contact_connections for that contact, diff against the freshly-computed set from CWEs, and DELETE the difference. ~30 LOC; not done.
- Initial production backfill is unbounded. The contact_connections:backfill_work_overlap rake task scans every CWE pair in the database when called without scoping. On dev (71k CWEs) this took 30s. Production has multiple orders of magnitude more rows; the scan would lock CPU on the writer for hours and the result-set could exceed memory. Fix path: chunked backfill driven by Maintenance::Task (the Shopify maintenance_tasks gem already in use), iterating per-organization or per-N-thousand contact-IDs. Resumable, observable in maintenance-tasks UI, throttleable. Ship before the table is enabled in any production environment.
- Reconciliation worker described but not implemented. §6.B.4 above promises "each cache has a paired reconciler Sidekiq job, scheduled nightly via cron." For kind=work_overlap, that worker does not exist. The after_commit hook is the only edge-write path today; a missed Sidekiq enqueue (Redis flap, deploy race) leaves a permanent gap until the next manual backfill. Fix path: schedule the chunked backfill from gap #2 as a nightly cron filtered to contact_work_experiences.updated_at > last_run; emit a drift counter when the reconciler updates a row whose timestamps are older than the threshold. Required before any production rollout that depends on freshness SLOs.
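The ghost-edge fix path amounts to a set difference. A minimal sketch in plain Ruby, assuming each edge can be represented as a [contact_a, contact_b, org] tuple; plan_edge_reconciliation and the tuple shape are illustrative and are not the WorkOverlapBackfillService API:

```ruby
require "set"

# existing: tuples currently stored in contact_connections for one contact
# fresh:    tuples recomputed from that contact's current ContactWorkExperience rows
def plan_edge_reconciliation(existing, fresh)
  existing_set = existing.to_set
  fresh_set = fresh.to_set
  {
    delete: (existing_set - fresh_set).to_a, # ghost edges: no source rows remain
    upsert: fresh_set.to_a                   # same rows the upsert path writes today
  }
end
```

Computing the plan separately from applying it keeps the diff testable without a database. In the real service, the delete list would presumably feed a scoped DELETE in the same transaction as the upsert.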
6.B.10 Sister-Finder integration gap (production read path still bypasses contact_connections)
The shipped 2-hop production endpoint, GET /api/v2/collections/:id/contacts/:id/connection_paths, is served by Contacts::ConnectionPaths::Finder. That service still reads from contact_work_experiences joined to itself by organization_id; it does not read from contact_connections. Until Finder is migrated, the unified table only serves the new 3-hop spike endpoint and cross-kind reads from the strength engine; the most-trafficked read path in the system is unaffected.
This is a quiet but load-bearing gap because:
- The "this table replaces six separate caches" claim in §3.1 is design-true but production-false. Two-hop reads for every list-view drill-in still pay the recursive-CTE-over-CWE cost on every call.
- Finder handles signal kinds that contact_connections doesn't carry yet: linkedin_connection and google_connection direct paths come from UserContactCollection.source introspection. Migrating Finder requires either (a) extending contact_connections backfill to include those kinds, or (b) keeping the UCC-source lookup as a parallel direct-path layer.
- Finder applies SIGNAL_WEIGHT ranking, consecutive-stint tolerance, and parent-source enrichment. DeepFinder intentionally does not. Migration is not a one-line swap; it's a several-week shadow-mode rollout.
Suggested migration sequence:
- Build backfills for the missing kinds (linkedin_connection, google_connection). This is Phase 9 territory in the rollout plan, but should be promoted earlier if Finder migration is a goal.
- Carry over SIGNAL_WEIGHT, CONSECUTIVE_OVERLAP_TOLERANCE, and DIRECT_SIGNAL_TYPE mappings into DeepFinder (or a new UnifiedFinder that subsumes both).
- Shadow-mode the new finder behind a flag: call both for every request, compare result sets, log divergence.
- Once shadow shows ≥99% parity for >1 week, swap the controller binding. Keep Finder in the codebase as fallback for one release cycle.
- Delete Finder + the CWE-self-join code path.
Estimated effort: 2–3 phases, not a single PR. Until this lands, the spec's "unified ContactConnection model" claim should be read as "unified for new consumers; legacy 2-hop reader still uses the old shape."
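The shadow-mode step in the migration sequence can be sketched as a small wrapper that always serves the legacy result and only measures the candidate. This is an illustration under assumptions: ShadowComparison, its logger callable, and the idea of passing both finders as lambdas are hypothetical, and result sets are compared as unordered sets of paths.

```ruby
require "set"

class ShadowComparison
  attr_reader :total, :matches

  def initialize(logger: ->(msg) { warn(msg) })
    @logger = logger
    @total = 0
    @matches = 0
  end

  # legacy and candidate are callables returning an array of paths.
  # The legacy result is always returned, so callers never see shadow failures.
  def call(request_id, legacy:, candidate:)
    legacy_result = legacy.call
    @total += 1
    begin
      if legacy_result.to_set == candidate.call.to_set
        @matches += 1
      else
        @logger.call("shadow divergence request=#{request_id}")
      end
    rescue StandardError => e
      @logger.call("shadow error request=#{request_id} #{e.class}: #{e.message}")
    end
    legacy_result
  end

  # Fraction of requests where both finders agreed; compare against 0.99.
  def parity
    total.zero? ? 0.0 : matches.to_f / total
  end
end
```

Counting a candidate exception as a non-match is a deliberate choice here: a finder that raises is not at parity, even if it never returns a wrong answer.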
The bottom line on caching
Cache aggressively for the list-view and rollup queries (impossible without it). Cache selectively for per-pair queries that hit hot paths (UCC strength chips, reachability column). Don't cache cold-path queries (admin, drill-in, audits); pure functions over source tables are fine. Each heuristic is independent at every layer: own worker, own reconciler, own version pin, own failure domain.
7. Operational envelope
Quantitative choices, rate limits, and volume estimates: the numbers reviewers ask about when deciding whether this will scale or what it will cost.
Findem cache
- Positive TTL: 90–180 days
- Negative TTL: 7–30 days
- Day-1 calls/user: ~800 (one per unique correspondent)
- Steady-state: 5–10 calls/day/user
- Invalidation: Findem F7 webhook on profile update
Backfill horizon
- On OAuth connect: 24 months (see DR-08)
- Older history: On-demand "load historical" job (future)
- First-sync duration: ~minutes, not hours
Gmail / Google Calendar
- Rate limit: 600 req/min/user; 1B units/day/project
- Sync cadence: Daily, 6am UTC
- Incremental: historyId (mail), syncToken (calendar)
Microsoft Graph
- Rate limit: 10k req / 10min / user
- Sync cadence: Daily, staggered from Google
- Incremental: Delta query tokens
Interaction volume (typical user)
- Day-1 events: ~20,000 (24-month backfill)
- Steady-state: ~50 events/day
- Row size: ~500 bytes
- 10 users × 2 years: ~1.5 GB raw events
CircuitBox (all external clients)
- error_threshold: 50%
- time_window: 60s
- volume_threshold: 5 requests
- sleep_window: 120s before half-open probe
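How the four CircuitBox settings interact is easier to see in a toy breaker. The sketch below is plain Ruby written for illustration; MiniBreaker is hypothetical and is not the Circuitbox gem's implementation or API. It only mirrors the semantics the settings names suggest: trip when at least volume_threshold calls inside time_window have a failure rate at or above error_threshold, then stay open for sleep_window before letting one probe through.

```ruby
class MiniBreaker
  def initialize(error_threshold: 50, time_window: 60, volume_threshold: 5, sleep_window: 120)
    @error_threshold = error_threshold   # percent of failures that trips the breaker
    @time_window = time_window           # seconds of history considered
    @volume_threshold = volume_threshold # minimum calls before the breaker may trip
    @sleep_window = sleep_window         # seconds to stay open before a probe
    @events = []                         # [timestamp, ok] pairs
    @opened_at = nil
  end

  def open?(now = Time.now)
    !@opened_at.nil? && (now - @opened_at) < @sleep_window
  end

  # Runs the block unless the breaker is open. Returns :skipped or :failed
  # instead of raising, so a sync worker can exit cleanly.
  def run(now: Time.now)
    return :skipped if open?(now)

    probing = !@opened_at.nil? # sleep window elapsed: this call is the half-open probe
    begin
      result = yield
      record(now, true)
      @opened_at = nil
      result
    rescue StandardError
      record(now, false)
      @opened_at = now if probing || tripped?
      :failed
    end
  end

  private

  def record(now, ok)
    @events << [now, ok]
    @events.reject! { |at, _| now - at > @time_window }
  end

  def tripped?
    return false if @events.size < @volume_threshold

    failures = @events.count { |_, ok| !ok }
    failures * 100.0 / @events.size >= @error_threshold
  end
end
```

With the defaults above, four consecutive failures leave the breaker closed (below the volume threshold), the fifth opens it, and calls are skipped for the next 120 seconds.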
Rollup worker
- Incremental: after_commit per event, O(1)
- Nightly rebuild: 00:00 UTC, 36-hour window
- Batch size: 500 pairs/job
- Uniqueness: sidekiq-unique-jobs :until_and_while_executed
OAuth tokens
- Google access: ~1 hour
- Google refresh: Long-lived, revoked on disconnect
- MS access: ~1 hour
- MS refresh: 24h rolling
- Storage: Encrypted at rest via lockbox
Scoring / reachability
- Scorer complexity: O(1) per pair; reads one stats row
- Reachability aggregate: SQL GROUP BY on UCC strength_tier
- Typical cascade on strength change: ~1–10 reachability recomputes
Privacy guardrails
- Scopes chosen: gmail.metadata, Mail.ReadBasic (no body access possible)
- No calendar titles stored: attendee list + times only
- Per-user opt-in: Each scope toggleable independently
- Disconnect wipe: All InteractionEvents deleted
8. Phased plan
[Diagram: phase dependency flowchart. Phase 1 (InteractionEvent + simple resolver) feeds Phase 2 (Gmail + GCal syncers + scope UI toggles) and Phase 6 (MS Graph integration). Phases 2 and 6 both feed Phase 3 (Rollup worker + stats table) → Phase 4 (Rule engine + Reachability) → Phase 5 (UI surfaces + filters) → Thin V1 ships (8/16 clauses active, both providers) → Phase 7 (Findem lookup + cache) → Phase 8 (User enrichment) → Phase 9 (Overlap calculators + activate clauses) → Full V1 (14/16 clauses active).]
| Phase | Scope | Depends on | Size | Slice | Spec |
|---|---|---|---|---|---|
| 1 | InteractionEvent + ContactEmailResolver + backfill hooks | none | S | A | 008 |
| 2 | Gmail + Google Calendar syncers + scope UI + CircuitBox retrofit | 1 | M | A | 008 |
| 3 | Rollup worker + UserContactInteractionStats | 2 | M | A | 008 |
| 4 | Rule engine + reachability + UCC strength columns + override | 3 | M | A | 008 |
| 5 | Contact-detail signals + sortable columns + reachability filters + hover reasons | 4 | M | A | 008 |
| 6 | MS Graph integration (OAuth, mail+cal syncers, admin-portal card) | 1 | L | A | 008 |
| 7 | Findem lookup_by_email + EnrichedEmailLookup cache; swap resolver | Findem F1 | S | B | 008 |
| 7a | 007 v1: contact_connections (kind=work_overlap, shipped) + WorkOverlapEdgeWorker after_commit hook (shipped) + DeepFinder spike (shipped) + CollectionOrgCurrentSharedContact + SharedListNetworkSummary + Network Connections tab UI (planned) | 1 (no email/calendar dep) | L | n/a | 007 v1 |
| 8 | User enrichment via Findem (F4) | Findem F4 | M | C | 008 |
| 9 | Education overlap edges (contact_connections kind=education_overlap) + edge worker; activate strength clauses for work + education | 8 | M | C | 008 + 007 v2 |
| 10 | (Optional) Investor overlap via F6: contact_connections kinds investment_overlap + board_peer + their edge workers | Findem F6 | M | C+ | 008 + 007 v3 |
| 11 | (Nice) Intro-request auto-draft | 5 | S | n/a | 008 |
9. FAQ
The sharp questions reviewers already have.
Won't this hammer Findem with thousands of calls per mailbox?
No. The EnrichedEmailLookup cache keys on normalized email, so Findem gets called once per unique correspondent, not once per event. Day-1 backfill is ~800 calls; steady state is <10/day. See DR-03.
What happens when a provider is down?
CircuitBox opens around the failing client, the sync worker exits cleanly, a Slack alert fires. New InteractionEvent rows pause for that provider; others are unaffected. Stats stay consistent with what we've seen. No data loss.
Can email data leak across collections?
InteractionEvent is scoped to the mailbox owner. Stats are per (user, contact). Reachability aggregates only the UCCs in one collection: a user in Collection A and B shares interaction data, but each collection only sees its own breadth counts.
What happens when a user disconnects their mailbox?
OAuth token revoked, provider-account row deleted, cleanup worker removes every InteractionEvent for that user. Dependent stats recompute (counts drop to 0). Strength tiers re-evaluate to Cold or null.
How do we prevent newsletters from being mistaken for real relationships?
At ingestion each email is flagged newsletter if any of: List-Unsubscribe header present, Precedence: bulk/list/junk, or >20 recipients. Newsletter-only inbound matches a Cold clause (C2), never Known.
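The three newsletter conditions translate directly into a predicate. A minimal sketch, assuming headers arrive as a hash of header name to value and the recipient count is already known; newsletter? and both constants are illustrative names, not existing Getro code:

```ruby
BULK_PRECEDENCE = %w[bulk list junk].freeze
MAX_PERSONAL_RECIPIENTS = 20

# headers: { "Header-Name" => "value" }, matched case-insensitively
def newsletter?(headers, recipient_count)
  normalized = headers.transform_keys { |name| name.to_s.downcase }
  return true if normalized.key?("list-unsubscribe")

  precedence = normalized["precedence"].to_s.strip.downcase
  return true if BULK_PRECEDENCE.include?(precedence)

  recipient_count > MAX_PERSONAL_RECIPIENTS
end
```

Header names are compared case-insensitively because mail header field names are case-insensitive. Exactly 20 recipients is still treated as personal mail, matching the ">20" wording.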
Do we ever create contacts from email data?
No. That was V1's failure mode. If the sender/recipient doesn't resolve to an existing Contact, the InteractionEvent stays with contact_id = NULL and waits. An after_commit hook on Contact create backfills attributable events later.
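The stage-then-backfill behaviour can be sketched in memory. This is an illustration only: StagedEvent, InteractionLog, and contact_created are hypothetical names standing in for the InteractionEvent model and the Contact after_commit hook, and Collection scoping is ignored for brevity.

```ruby
StagedEvent = Struct.new(:contact_email, :contact_id, keyword_init: true)

class InteractionLog
  attr_reader :events

  def initialize
    @events = []
    @contact_ids_by_email = {}
  end

  # Ingestion: attribute when the email is already known,
  # otherwise stage the event with contact_id nil. Never creates a contact.
  def ingest(email)
    key = email.downcase
    event = StagedEvent.new(contact_email: key, contact_id: @contact_ids_by_email[key])
    @events << event
    event
  end

  # Stand-in for the hook on Contact create: attribute staged events.
  # Returns how many events were backfilled.
  def contact_created(contact_id, emails)
    keys = emails.map(&:downcase)
    keys.each { |key| @contact_ids_by_email[key] = contact_id }
    @events.count do |event|
      next false unless event.contact_id.nil? && keys.include?(event.contact_email)

      event.contact_id = contact_id
      true
    end
  end
end
```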
Does this work for Outlook / Microsoft 365 users?
Yes. Building Microsoft Graph in parallel with Google is the whole point. The scoring engine is provider-agnostic: an Outlook user and a Gmail user who know the same contact produce indistinguishable signals.
Can an individual hide their own signals?
V1: no hide-toggle, but they can override their own strength up or down. Disconnecting their mailbox is the escape hatch: contributions fall to Cold/None.
Why compute a numeric score if we don't show it?
Two places it matters: sorting a Warm list by "warmest first", and picking the Key Connection when multiple team members are all Warm. V1 hides it per product call; V2 exposes it.
What are the thin-V1 ship criteria?
Phases 1–6 complete: both providers connected, 8 of 16 heuristic clauses active, reachability + key connection functional, UI renders signals on contact detail + list. Findem dedup (7) and user enrichment (8–9) arrive incrementally.
10. Open questions
Findem capability confirmations
- F1 (person lookup by email): does the endpoint exist? Rate limits, cost per call, response schema, confidence scoring?
- F3 (email-only enrichment): fallback when no handle is known?
- F4 (user enrichment): any product/TOS concern enriching authenticated users vs. external contacts?
- F6 (investor data): does Findem expose company investor / funding rounds?
- LinkedIn degree: does Findem track 1st-degree LinkedIn connections? Long shot; would close the last V1 gap.
Team decisions before execution
- Ship Slice A (thin V1, 50% heuristic activation) as the first customer-facing release, or wait for Slice C?
- "Sustained at lower volume" threshold: recommend 8 of last 8 quarters continuous. Alternative: 6 of 8.
- Reachability editable? Recommend no in V1: it's a pure aggregation, and editing would create drift.
- Strength-override audit: when a user overrides, keep a record of what the rule engine would have said (for future rule tuning)?
- Backfill horizon: 24 months default. Confirm.
- Browser-extension LinkedIn capture: worth a short spike to see if 1st-degree data is already being scraped.
11. File references
Key existing Getro code to extend or mirror.
Backend
| Purpose | Path |
|---|---|
| Google OAuth client | backend/lib/google/oauth_client.rb |
| People API client | backend/app/services/google/people_client.rb |
| GoogleAccount model | backend/app/models/google_account.rb |
| Reference syncer | backend/app/services/contacts/import/google/contacts_syncer.rb |
| Reference scheduler | backend/app/workers/schedulers/contacts/import/google_contacts_daily_sync_scheduler.rb |
| Contact model | backend/app/models/contact.rb |
| ContactEmail | backend/app/models/contact_email.rb |
| Dedup intake | backend/app/services/contacts/contact_creator.rb |
| Dedup lookup | backend/app/services/contacts/existing_contact_finder.rb |
| Merge with audit | backend/app/services/contacts/merge_service.rb |
| UCC (extend) | backend/app/models/user_contact_collection.rb |
| Findem base | backend/lib/findem/client.rb, backend/lib/findem/apis/ |
| CircuitBox config | backend/config/initializers/circuitbox.rb |
Admin portal
| Purpose | Path |
|---|---|
| Integrations page (per-user) | admin-portal/src/pages/Settings/integrations/ |
| Google OAuth flow hook | admin-portal/src/pages/Settings/integrations/hooks/useIntegrationsPage.jsx |
| Google RTK Query service | admin-portal/src/services/userGoogleAccountsV2.js |
| Path-card atoms (reuse for signals) | admin-portal/src/pages/listDetail/networkConnections/components/pathCard/ |
| Contact detail page | admin-portal/src/components/organisms/contactDetail/ |
| Contact list views | admin-portal/src/pages/contactsExtended/ |
12. Decision records
Short ADRs for the non-obvious choices landed during this spike. The "why we picked this" future maintainers will ask.
- Context: Two providers, two data kinds. A naive model would split into GmailInteraction, OutlookInteraction, GoogleCalendar, OutlookCalendar.
- Decision: One InteractionEvent table; provider + kind are enum columns.
- Consequences: Scoring is provider-agnostic by construction. Adding a third provider (e.g. Slack DMs later) is an enum value, not a new pipeline.
- Alternatives: Per-provider tables. Rejected: doubles the scoring code path and couples tier logic to source.
- Context: V1 G Suite integration created duplicates because Getro's Contact is Collection-scoped: the same email in two collections is two rows.
- Decision: Resolver passes every unresolved email through Findem's canonical-identity lookup, then matches to Getro Contact via globally-unique linkedin_handle first, then ContactEmail.
- Consequences: No schema change to Contact. The identity problem moves outside Getro to Findem's graph.
- Alternatives: Global Contact primary key (breaks Collection isolation); per-collection fallback lookup (doesn't close the cross-collection dup gap).
- Context: Per-event calls would cost thousands per user per day and trip rate limits.
- Decision: EnrichedEmailLookup cache fronts every Findem call. TTL 90–180d positive, 7–30d negative.
- Consequences: ~800 calls per user on backfill, <10/day steady state. Cache invalidation via Findem F7 profile-update webhook.
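A small in-memory sketch shows how positive and negative TTLs keep the call count at one per unique correspondent. Everything here is assumed for illustration: EmailLookupCache is not the EnrichedEmailLookup model, the lookup block stands in for the Findem client, normalization is taken to be strip + downcase, and the TTLs use the lower bounds of the quoted ranges.

```ruby
DAY = 24 * 60 * 60
POSITIVE_TTL = 90 * DAY # lower bound of the 90-180 day range
NEGATIVE_TTL = 7 * DAY  # lower bound of the 7-30 day range

class EmailLookupCache
  Entry = Struct.new(:identity, :expires_at)

  attr_reader :remote_calls

  # The block is the remote lookup: email -> identity, or nil when unknown.
  def initialize(&lookup)
    @lookup = lookup
    @entries = {}
    @remote_calls = 0
  end

  def fetch(email, now: Time.now)
    key = email.strip.downcase
    entry = @entries[key]
    # A live entry answers even when identity is nil (negative hit).
    return entry.identity if entry && entry.expires_at > now

    @remote_calls += 1
    identity = @lookup.call(key)
    ttl = identity ? POSITIVE_TTL : NEGATIVE_TTL
    @entries[key] = Entry.new(identity, now + ttl)
    identity
  end

  # What a profile-update webhook handler would call.
  def invalidate(email)
    @entries.delete(email.strip.downcase)
  end
end
```

Caching misses matters as much as caching hits: without the negative entry, every event from an unknown correspondent would trigger a fresh remote call.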
- Context: Team thread split on rules-vs-weights. Admins want explainable tiers; product wants within-tier ranking.
- Decision: Rules assign the tier (first-match-wins). Weighted score computed independently, used only for within-tier sort + Key Connection tiebreak.
- Consequences: V1 tiers are explainable ("you matched W1 and W4"). V2 exposes weights for admin tuning.
- Alternatives: Pure weighted score with hard thresholds. Rejected: not explainable.
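The two layers can be sketched side by side. The rules, thresholds, and weights below are invented purely for illustration (the real clauses live in the rule engine); only the shape is taken from the decision: first matching rule assigns the tier, and the weighted score is consulted only to order pairs inside a tier.

```ruby
PairSignals = Struct.new(:user, :emails_12mo, :meetings_12mo, :replied_3mo, keyword_init: true)

# Hypothetical rules, evaluated top to bottom; the first match wins.
RULES = [
  [:warm,  "W1", ->(s) { s.meetings_12mo >= 2 }],
  [:warm,  "W4", ->(s) { s.replied_3mo && s.emails_12mo >= 10 }],
  [:known, "K1", ->(s) { s.emails_12mo >= 1 }],
  [:cold,  "C1", ->(_s) { true }] # catch-all
].freeze

# Hypothetical weights; they never change the tier.
WEIGHTS = { emails_12mo: 1.0, meetings_12mo: 5.0 }.freeze
TIER_ORDER = { warm: 0, known: 1, cold: 2 }.freeze

# Returns [tier, clause_id] so the UI can explain why.
def tier_for(signals)
  RULES.find { |_tier, _clause, predicate| predicate.call(signals) }.first(2)
end

def score_for(signals)
  WEIGHTS.sum { |field, weight| signals[field] * weight }
end

# Tier first, then score descending within the tier.
def rank(all_signals)
  all_signals.sort_by { |s| [TIER_ORDER.fetch(tier_for(s).first), -score_for(s)] }
end
```

Under this arrangement, the first element of rank's output among one contact's team members is the natural Key Connection candidate, and a high score can never lift a Known pair above a Warm one.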
- Context: Waiting for Findem confirmations blocks ship. Email + calendar alone cover 50% of clauses.
- Decision: Phases 1–6 ship without Findem or user enrichment. Findem (Phase 7) and user enrichment (8–9) land incrementally, no data migration required.
- Consequences: Dedup coverage is ~40–60% on Slice A vs. ~75–85% with Findem. The architecture is additive at every stage.
- Context: Strength is per (user, contact). UserContactCollection already models that link.
- Decision: Add 4 nullable columns to UCC: strength_tier, strength_score, strength_override, strength_computed_at.
- Consequences: No new join for every strength read. Migration is purely additive.
- Alternatives: Separate UserContactStrength table. Rejected: one-to-one with UCC with no justification.
- Context: Reachability aggregates strength tiers. If users edit it, it drifts from its own inputs.
- Decision: V1: strength is overridable (per-user), reachability is not. V2: revisit if users request it.
- Consequences: Single source of truth is UCC strength. Reachability always derivable from current state.
- Context: Unbounded backfill is slow and expensive. The "sustained 2-year" heuristic defines the practical signal floor.
- Decision: Read last 24 months of email + calendar on first connect. Older history via explicit "load historical" button (future).
- Consequences: First sync completes in minutes. Signals stabilize within 24h of connection.
- Context: Windowed counters (email_count_12mo, has_response_in_3mo) need to forget events as they age out. Incremental can add but can't drop.
- Decision: Incremental path on event insert (O(1) bump). Nightly full rebuild sweeps the aging tail + reconciles drift.
- Consequences: Two code paths for the same stats. Incremental is fast; nightly is safe. If incremental ever lags, nightly heals.
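Why both paths are needed shows up in a few lines. A minimal sketch, assuming a 365-day window for email_count_12mo; WindowedStats and its methods are illustrative names, not the rollup worker's code:

```ruby
TWELVE_MONTHS = 365 * 24 * 60 * 60 # assumed window length in seconds

class WindowedStats
  attr_reader :email_count_12mo

  def initialize
    @email_count_12mo = 0
  end

  # Incremental path: O(1) bump on event insert. It can only add.
  def record_event
    @email_count_12mo += 1
  end

  # Nightly path: recount from source events, dropping the aged-out tail.
  def rebuild(event_times, now:)
    @email_count_12mo = event_times.count { |at| at > now - TWELVE_MONTHS }
  end
end
```

If one event is 400 days old and another is 10 days old, two incremental bumps leave the counter at 2, and only the rebuild brings it back to 1.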
- Context: 007's list view is an aggregation problem (50 orgs × 10k contacts × employees) with a 50ms p99 budget. Drill-in is a fixed 2-hop traversal with time-windowed bridges. We considered Neo4j, TigerGraph, Apache AGE, live JOIN, and Postgres recursive CTEs.
- Decision: Use Postgres tables maintained by Sidekiq workers via Rails after_commit hooks, with nightly reconciliation. One unified ContactConnection table with a kind enum (work_overlap | education_overlap | investment_overlap | board_peer | email_exchange | meeting | linkedin_connection) replaces what an earlier draft of this ADR called separate per-type tables. Aggregation rollups (SharedListNetworkSummary, ContactReachability) stay in their own tables since they're consumer-specific. Reuse Contacts::ConnectionPaths::Finder kernel logic, parameterized over kind.
- Alternatives considered:
  - Live JOIN at query time: rejected. p99 ≈ 250ms; degrades non-linearly with collection size; blows the 50ms list-view budget.
  - Apache AGE (Postgres openCypher extension): rejected. No measurable performance gain at depth 2; immature ecosystem; doubles query-language surface area.
  - Neo4j or other dedicated graph DB: rejected. Cross-DB ETL pipeline; weaker at GROUP BY aggregations; network hop alone exceeds the latency budget.
  - Postgres recursive CTEs only: partial accept. Fine for drill-in fallback; insufficient for list view because per-render fan-out is too expensive.
- Consequences: Zero new infrastructure. List-view p99 ≈ 30ms (v1) / ~30ms (v2). Drill-in p99 ≈ 80ms (v1) / ~150ms (v2 UNION ALL). Storage per collection ≈ 5 MB (v1) / 15–25 MB (v2). Schema is forward-compatible with v2/v3 expansion. Recursive CTE drill-in degrades if depth ever exceeds 2, including 3+ hop chains across mixed edge types.
- Reversibility: High. The precomputed tables are a cache layer over the existing schema. If a graph DB is later justified, add it as a parallel store and migrate query paths feature-by-feature.
- Revisit triggers: 3+ hop traversal as a product feature; first-class social edges becoming dense and primary; centrality / community detection on the roadmap; interactive shortest-path features. See companion doc Graph DB vs Postgres ADR §11.6 + §12.
- Context: Two layers of state need maintenance: per-pair connection rows (the ContactConnection table) and per-list rollups (SharedListNetworkSummary, ContactReachability). For the connection table itself, two organizational choices: split into per-kind tables (separate worker per table) or unify into one polymorphic table (worker dispatches on kind).
- Decision: Unify into a single contact_connections table with a kind enum. Per-kind workers still exist (one Sidekiq worker class per kind: WorkOverlapEdgeWorker shipped today; EmailExchangeRollupWorker, MeetingRollupWorker, etc. planned) but they all write to the same table with their respective kind value. Each worker is triggered by after_commit on its source table (e.g., ContactWorkExperience for the work-overlap kind); deduplicated via sidekiq-unique-jobs; reconciled by its own nightly cron filtered to WHERE kind = '...'.
- Consequences:
  - (+) Adding a new connection type is an enum addition + new worker class, with no schema migration.
  - (+) Cross-kind queries ("all signals for pair (A,B)") are a single primary-key lookup.
  - (+) Strength rule engine reads one query per pair and dispatches on kind in code.
  - (+) Fewer tables to maintain, back up, monitor.
  - (-) Indexes are slightly coarser than per-kind tables; mitigated by composite indexes on (contact_a_id, kind) and (contact_b_id, kind).
  - (-) Per-kind reconciliation requires WHERE kind = '...' filter; not a problem with proper indexes.
  - (-) Some kinds use sparse columns (e.g., date_from only meaningful for time-bound kinds). Acceptable trade-off for the schema simplicity.
- Alternatives:
  - Per-kind tables (an earlier draft of this ADR): rejected. Adding a new kind required a new table + worker + migration. Cross-kind queries needed UNION ALL. Schema churn at v1 → v2 → v3 was painful.
  - EAV-style key-value table: rejected. Excessive metadata, no schema validation, query writes become awkward.
  - Polymorphic worker with kind dispatch (one worker handling all kinds): rejected. Couples failure domains; one bug in email rollup blocks work-overlap rebuilds.
- Reversibility: Medium. Splitting ContactConnection back into per-kind tables would require a migration that reads each kind into its dedicated table. ~1 week of work plus careful in-flight-job handling. Reverse direction (per-kind → unified) is easier.
- Revisit triggers: If write amplification ever becomes a measurable bottleneck (target: <500ms total fan-out per source change); if a specific kind reaches such different scale or query patterns that hosting it in its own table earns its keep. Currently no kind is anywhere near that threshold.