GET-109 · Provider Integration Deep-Dive

Gmail, Outlook, Google Calendar, Outlook Calendar

Concrete plan per provider: what exists, what extends, and what's greenfield, with code skeletons and official doc links. Enough for the team to scope execution and choose complexity vs. UX on shared mailboxes.

1. Overview

Two providers, four metadata sources, one normalized pipeline. The Getro backend already has a live Google Contacts sync we extend; Microsoft 365 is greenfield.

| Dimension               | Google Workspace                                                    | Microsoft 365                                           |
|-------------------------|---------------------------------------------------------------------|---------------------------------------------------------|
| OAuth infrastructure    | Live · Signet + lockbox tokens in GoogleAccount                     | Build · Mirror GoogleAccount as MicrosoftAccount        |
| Scope declaration       | Live · gmail.metadata + calendar.readonly already in GOOGLE_SCOPES  | Build · Register scopes in Azure app manifest           |
| Admin portal UI         | Partial · Google card exists, needs per-scope toggles               | Build · Mirror Google card + new RTK Query service      |
| Email sync              | Build · users.messages.list + history.list for incremental          | Build · /me/messages + /me/mailFolders/{id}/messages/delta |
| Calendar sync           | Build · events.list with syncToken                                  | Build · /me/events + /me/calendarView/delta             |
| Shared mailbox          | Google Groups ✅ free, delegation ⚠️ complex                         | Mail.Read.Shared ✅ no admin consent                     |
| Rate limits             | 1B units/day/project · 250/user/s                                   | Varies per service · throttled via 429 + Retry-After    |
| Metadata scope approval | Approved ✅ (per product)                                            | N/A: Mail.ReadBasic does not require verification       |

The entire Google side is "extend, not build." The OAuth client, token refresh, daily scheduler pattern, and admin-portal card already exist for contacts. We mirror those files for Gmail metadata + Calendar, and add two new scope toggles to the UI. The Microsoft side is from scratch, but the shape of each file has a Google analogue to follow.

2. Shared infrastructure

Conventions every provider integration will use. Most of these already exist in code; we're just documenting them so the MS mirror knows what to copy.

OAuth token lifecycle

Getro's established pattern, verified in backend/lib/google/oauth_client.rb + backend/app/services/google/people_client.rb:

  • Storage: access + refresh tokens on the provider-account row (GoogleAccount, future MicrosoftAccount). Both columns are encrypts-wrapped by lockbox.
  • Exchange: Faraday POST to /token endpoint. Access + refresh + id_token come back, persisted on the model.
  • Refresh: Signet::OAuth2::Client handles refresh. Called on demand when the API returns 401.
  • Single-retry on 401: every client wraps API calls in a refresh-and-retry loop:
def list_something(account:)
  tried_refresh = false
  begin
    build_service(account: account).list_something
  rescue Google::Apis::AuthorizationError, Google::Apis::ClientError => e
    if !tried_refresh && auth_error?(e) && refresh_access_token!(account: account)
      tried_refresh = true
      retry
    end
    raise
  end
end

Pattern source: backend/app/services/google/people_client.rb:44-73. Microsoft client mirrors this exactly.

CircuitBox convention

Already configured in backend/config/initializers/circuitbox.rb: global Slack notifier hooks on open.circuitbox, close.circuitbox, warning.circuitbox. Not yet applied to Google: a gap we retrofit.

Convention: one named circuit per provider × kind.

Circuitbox.circuit(
  :google_gmail_metadata,
  exceptions: [
    Google::Apis::ClientError,
    Google::Apis::RateLimitError,
    Signet::AuthorizationError,
    Faraday::TimeoutError
  ],
  error_threshold:  50,   # percent
  time_window:      60,   # seconds
  volume_threshold: 5,    # min calls before measuring
  sleep_window:     120   # seconds before half-open probe
)

Circuit names: :google_people, :google_gmail_metadata, :google_calendar, :ms_graph_mail, :ms_graph_calendar, :findem_profile. Each isolates its provider's outages.
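
To keep the provider × kind naming from drifting, the lookup could live in one place and fail loudly on unknown pairs. A minimal sketch, assuming a hypothetical CircuitNames module (it does not exist in the codebase); only the circuit names themselves come from the list above:

```ruby
# Hypothetical helper (not in the codebase): resolves the circuit name for a
# provider/kind pair so callers never hand-type the symbol.
module CircuitNames
  REGISTRY = {
    %i[google people]         => :google_people,
    %i[google gmail_metadata] => :google_gmail_metadata,
    %i[google calendar]       => :google_calendar,
    %i[microsoft mail]        => :ms_graph_mail,
    %i[microsoft calendar]    => :ms_graph_calendar,
    %i[findem profile]        => :findem_profile
  }.freeze

  # Raises KeyError for an unregistered pair instead of silently creating
  # a new circuit that nobody monitors.
  def self.name_for(provider:, kind:)
    REGISTRY.fetch([provider, kind])
  end
end
```

A client would then call Circuitbox.circuit(CircuitNames.name_for(provider: :microsoft, kind: :mail), circuit_options).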

Scheduler + worker pattern

Verified exemplar from config/cron_schedule.yml:172-174:

google_contacts_daily_sync:
  cron: 'every day at 6am UTC'
  class: Schedulers::Contacts::Import::GoogleContactsDailySyncScheduler

The scheduler fans out one job per eligible account. Workers use sidekiq-unique-jobs (already in Gemfile) for per-account uniqueness:

class GmailMetadataSyncerJob < BaseWorker
  sidekiq_options queue: :contacts_bulk_import_high_priority, retry: 5
  sidekiq_options lock: :until_and_while_executed
  unique_args { |args| [args[0], 'gmail_metadata'] }   # per-account × per-type

  def perform(google_account_id)
    account = GoogleAccount.find_by(id: google_account_id)
    return unless account&.has_refresh_token?
    return unless account.has_gmail_metadata_scope?

    Emails::Import::Google::GmailMetadataSyncer.call(account:)
  end
end
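
The fan-out step itself might look like the following. This is a sketch under stated assumptions: the Account struct stands in for GoogleAccount, fan_out is a hypothetical name, and the enqueue callable is injected so the example runs without Sidekiq (in the app it would be GmailMetadataSyncerJob.perform_async).

```ruby
GMAIL_METADATA_SCOPE = 'https://www.googleapis.com/auth/gmail.metadata'

# Stand-in for GoogleAccount; the real model exposes the same predicates.
Account = Struct.new(:id, :refresh_token, :scopes, keyword_init: true) do
  def has_refresh_token?
    !refresh_token.to_s.empty?
  end

  def has_gmail_metadata_scope?
    scopes.include?(GMAIL_METADATA_SCOPE)
  end
end

# Hypothetical scheduler body: one enqueue per eligible account.
# Accounts without a refresh token or without the scope are skipped here,
# which keeps no-op jobs out of the queue (the worker re-checks anyway).
def fan_out(accounts, enqueue:)
  accounts
    .select { |a| a.has_refresh_token? && a.has_gmail_metadata_scope? }
    .map { |a| enqueue.call(a.id) }
end
```

The worker repeats both checks because an account can be revoked between scheduling and execution.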

Testing convention

VCR cassettes (vcr 6.0 in Gemfile, group :test). Cassettes live in spec/cassettes/. For network calls during tests, webmock is enabled and only VCR-recorded HTTP goes through.

Network Egress Registry

Per backend/docs/integrations.md, every new outbound network dependency adds a row. Template:

| Service                         | Purpose                                   | Auth                 | Env vars                         | CircuitBox |
|---------------------------------|-------------------------------------------|----------------------|----------------------------------|------------|
| Gmail Metadata API              | Email metadata ingestion per user         | OAuth 2.0 user token | GOOGLE_CLIENT_ID, _SECRET        | Yes        |
| Google Calendar Metadata API    | Calendar metadata ingestion per user      | OAuth 2.0 user token | (same)                           | Yes        |
| Microsoft Graph Mail            | Outlook mail metadata ingestion per user  | OAuth 2.0 user token | MS_CLIENT_ID, _SECRET, _TENANT   | Yes        |
| Microsoft Graph Calendar        | Outlook calendar metadata ingestion       | OAuth 2.0 user token | (same)                           | Yes        |

Feature flag convention

Bit flags in features_mask. Existing google_sync is bit 51. New allocations needed:

  • gmail_metadata_sync
  • google_calendar_metadata_sync
  • microsoft_sync (covers both MS mail + calendar)
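
For reference, checking and setting a bit in features_mask reduces to a shift and a mask. A minimal sketch: only google_sync = bit 51 comes from the codebase; the other three positions are placeholders chosen for illustration, and the helper names are hypothetical.

```ruby
# google_sync is the existing allocation. The remaining positions are
# PLACEHOLDERS for illustration only; the real bits are still to be allocated.
FEATURE_BITS = {
  google_sync: 51,
  gmail_metadata_sync: 52,            # hypothetical position
  google_calendar_metadata_sync: 53,  # hypothetical position
  microsoft_sync: 54                  # hypothetical position
}.freeze

# True when the feature's bit is set in the mask.
def feature_enabled?(features_mask, feature)
  features_mask.anybits?(1 << FEATURE_BITS.fetch(feature))
end

# Returns a new mask with the feature's bit set (idempotent).
def enable_feature(features_mask, feature)
  features_mask | (1 << FEATURE_BITS.fetch(feature))
end
```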

3. Google Workspace

Extend live integration · 3 new clients

3.1 Current state (verified in code)

| Component | Path | Status |
|-----------|------|--------|
| OAuth client | backend/lib/google/oauth_client.rb | Live |
| GoogleAccount model | backend/app/models/google_account.rb | Live · encrypted tokens, scopes JSONB |
| People API client (contacts sync) | backend/app/services/google/people_client.rb | Live · token-refresh-on-401 pattern |
| Contacts syncer service | backend/app/services/contacts/import/google/contacts_syncer.rb | Live · sync-token pagination + 410 fallback |
| Daily scheduler | backend/app/workers/schedulers/contacts/import/google_contacts_daily_sync_scheduler.rb | Live · 6am UTC cron |
| Admin-portal card + hook | admin-portal/src/pages/Settings/integrations/hooks/useIntegrationsPage.jsx | Live · @react-oauth/google popup flow, 3s polling |
| RTK Query service | admin-portal/src/services/userGoogleAccountsV2.js | Live · GET/POST/DELETE on /user_google_accounts |
| CircuitBox wrap on Google | - | Build · gap, retrofit all clients |
| Scope toggles in UI | - | Build · per-source checkboxes + updated useGoogleLogin scope string |
| Gmail metadata client | backend/lib/google/gmail_metadata_client.rb | Build |
| Calendar client | backend/lib/google/calendar_client.rb | Build |
| Gmail / Calendar syncers + jobs + schedulers | backend/app/services/emails/import/google/, backend/app/services/calendars/import/google/ | Build |

Verified via code read: GOOGLE_SCOPES in google_account.rb lines 33-38 declares all four scopes (contacts.readonly, contacts.other.readonly, gmail.metadata, calendar.readonly). However, the frontend's useGoogleLogin hook (useIntegrationsPage.jsx:104) currently only requests the two contacts scopes: the metadata/calendar scopes are declared server-side but never requested from the user. Scope UI toggles must update both the frontend request string and the backend's acceptance logic.

3.2 OAuth flow + scope management

The existing flow, visualized:

sequenceDiagram
  participant U as User
  participant AP as Admin Portal
  participant BE as Getro backend
  participant G as Google OAuth
  U->>AP: Click "Connect Google"
  AP->>G: useGoogleLogin popup (flow: auth-code)
  Note over AP,G: redirect_uri: postmessage<br/>scope: contacts.readonly + other.readonly<br/>[NEW: + gmail.metadata + calendar.readonly]
  G-->>U: Consent screen (shows per-scope)
  U->>G: Approve
  G-->>AP: authorization code
  AP->>BE: POST /user_google_accounts { code }
  BE->>G: exchange_code_for_token(code)
  G-->>BE: access_token + refresh_token + id_token
  BE->>BE: GoogleAccount.upsert<br/>(scopes = returned granted_scopes)
  BE-->>AP: { id, syncStatus: syncing }
  AP->>BE: Poll GET /user_google_accounts every 3s
  BE-->>AP: syncStatus transitions to synced

Scope strings (canonical list)

# GoogleAccount::GOOGLE_SCOPES (verified in code)
contacts          = 'https://www.googleapis.com/auth/contacts.readonly'
other_contacts    = 'https://www.googleapis.com/auth/contacts.other.readonly'
gmail_metadata    = 'https://www.googleapis.com/auth/gmail.metadata'
calendar_readonly = 'https://www.googleapis.com/auth/calendar.readonly'

Admin portal change โ€” scope toggles

Current code:

// useIntegrationsPage.jsx:99-105 (current)
const googleLogin = useGoogleLogin({
  onSuccess, onError,
  flow: 'auth-code',
  redirect_uri: 'postmessage',
  scope: 'https://www.googleapis.com/auth/contacts.readonly https://www.googleapis.com/auth/contacts.other.readonly',
});

After (scope string built from user's checkbox state):

const buildScopeString = ({ contacts, gmailMetadata, calendar }) => [
  contacts       && 'https://www.googleapis.com/auth/contacts.readonly',
  contacts       && 'https://www.googleapis.com/auth/contacts.other.readonly',
  gmailMetadata  && 'https://www.googleapis.com/auth/gmail.metadata',
  calendar       && 'https://www.googleapis.com/auth/calendar.readonly',
].filter(Boolean).join(' ');

const googleLogin = useGoogleLogin({
  flow: 'auth-code',
  redirect_uri: 'postmessage',
  scope: buildScopeString(scopeChoices),
  // ...
});

Incremental consent. If the user previously connected with only contacts scopes and later enables gmail metadata, Google's consent flow prompts re-authorization. Handle this by setting include_granted_scopes: true, so Google merges previously-granted scopes with the new ones without forcing re-entry of everything.

Backend accepts granted scopes

The existing GoogleAccount.granted_scopes pattern (line 44-46) is already scope-aware. Sync jobs check has_gmail_metadata_scope? (line 56-58) before running. No change needed server-side except enabling the workers to run based on those flags.
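
As an illustration of what "accepting granted scopes" involves: Google's token response carries the granted scopes as a space-delimited scope string, which can be intersected with the canonical list before persisting. A sketch only; the hash shape and the granted_scope_keys name are assumptions, not the actual GoogleAccount code.

```ruby
# Mirrors the canonical scope list; the hash shape is an assumption.
GOOGLE_SCOPES = {
  contacts:          'https://www.googleapis.com/auth/contacts.readonly',
  other_contacts:    'https://www.googleapis.com/auth/contacts.other.readonly',
  gmail_metadata:    'https://www.googleapis.com/auth/gmail.metadata',
  calendar_readonly: 'https://www.googleapis.com/auth/calendar.readonly'
}.freeze

# Returns the GOOGLE_SCOPES keys the user actually granted. Anything else in
# the response (openid, email, ...) is ignored.
def granted_scope_keys(token_response)
  granted = token_response.fetch('scope', '').split
  GOOGLE_SCOPES.select { |_key, url| granted.include?(url) }.keys
end
```

Storing what was granted rather than what was requested matters because users can untick individual scopes on the consent screen.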

3.3 Gmail metadata client

Extends lib/google/ with a new client class. Uses google-api-client (already in Gemfile).

What the gmail.metadata scope returns

Per the users.messages.list docs:

  • Allowed params: labelIds, maxResults (default 100, max 500), pageToken.
  • Blocked params: q (search query) and includeSpamTrash, explicitly rejected when using metadata scope.
  • List response: only id + threadId per message. No headers.
  • For headers: follow-up messages.get(id, format: 'metadata') returns selected headers (From, To, Cc, Bcc, Date, Subject, Message-ID, In-Reply-To, References, List-Unsubscribe, Precedence).

Incremental sync via users.history.list

Per history.list docs:

  • Each sync run stores the returned historyId.
  • Next run passes startHistoryId; Gmail returns only records changed since.
  • Record types: messageAdded, messageDeleted, labelAdded, labelRemoved. For metadata ingestion we care about messageAdded.
  • Expiry: historyId is valid "at least a week" but may expire earlier. 404 means do a full sync.
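
Turning history records into the set of message IDs to fetch is mostly a flatten-and-dedupe. A minimal sketch, assuming the record objects expose messages_added and message.id the way the google-api-client History objects do; the Structs below are stand-ins so the example runs without the gem, and added_message_ids is a hypothetical helper name.

```ruby
# Stand-ins for the Gmail client's History / HistoryMessageAdded / Message objects.
MessageRef    = Struct.new(:id)
MessageAdded  = Struct.new(:message)
HistoryRecord = Struct.new(:id, :messages_added)

# Collects the IDs of messages added since the stored historyId.
# A message can appear in several records, so the result is de-duplicated;
# records without messageAdded entries (nil) are skipped.
def added_message_ids(history_records)
  history_records
    .flat_map { |record| record.messages_added.to_a }
    .map { |added| added.message.id }
    .uniq
end
```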

Client skeleton

# backend/lib/google/gmail_metadata_client.rb
require 'google/apis/gmail_v1'
require 'signet/oauth_2/client'

module Google
  class GmailMetadataClient
    METADATA_HEADERS = %w[From To Cc Bcc Subject Date Message-ID In-Reply-To References
                          List-Unsubscribe Precedence Delivered-To Return-Path].freeze

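    # Gmail ANDs multiple labelIds (a message must carry every label listed), so
    # ['INBOX', 'SENT'] would match almost nothing. Call once per label instead.
    def list_messages(account:, page_token: nil, label_ids: ['INBOX'], page_size: 500)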
      with_refresh(account: account) do
        gmail = build_service(account: account)
        result = gmail.list_user_messages(
          'me',
          label_ids: label_ids,
          max_results: page_size,
          page_token: page_token
        )
        { messages: result.messages.to_a, next_page_token: result.next_page_token,
          result_size_estimate: result.result_size_estimate }
      end
    end

    def get_message_metadata(account:, message_id:)
      with_refresh(account: account) do
        gmail = build_service(account: account)
        gmail.get_user_message(
          'me', message_id,
          format: 'metadata',
          metadata_headers: METADATA_HEADERS
        )
      end
    end

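    # Follows nextPageToken internally so callers receive the complete change set;
    # next_page_token in the returned hash is therefore always nil.
    def list_history(account:, start_history_id:, history_types: ['messageAdded'])
      with_refresh(account: account) do
        gmail = build_service(account: account)
        history = []
        page_token = nil
        result = nil
        loop do
          result = gmail.list_user_histories(
            'me',
            start_history_id: start_history_id,
            history_types: history_types,
            max_results: 500,
            page_token: page_token
          )
          history.concat(result.history.to_a)
          page_token = result.next_page_token
          break if page_token.nil?
        end
        { history: history, next_page_token: nil, history_id: result.history_id }
      end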
    rescue Google::Apis::ClientError => e
      raise HistoryIdExpiredError if e.status_code.to_i == 404
      raise
    end

    class HistoryIdExpiredError < StandardError; end

    private

    def build_service(account:)
      gmail = ::Google::Apis::GmailV1::GmailService.new
      gmail.client_options.open_timeout_sec = 5
      gmail.client_options.read_timeout_sec = 30
      gmail.request_options.retries = 0
      gmail.authorization = authorization_for(account: account)
      gmail
    end

    def authorization_for(account:)
      Signet::OAuth2::Client.new(
        token_credential_uri: 'https://oauth2.googleapis.com/token',
        client_id: ENV['GOOGLE_CLIENT_ID'],
        client_secret: ENV['GOOGLE_CLIENT_SECRET'],
        scope: ['https://www.googleapis.com/auth/gmail.metadata'],
        access_token: account.oauth_access_token,
        refresh_token: account.oauth_refresh_token
      )
    end

    def with_refresh(account:)
      Circuitbox.circuit(:google_gmail_metadata, circuit_options).run do
        tried_refresh = false
        begin
          yield
        rescue ::Google::Apis::AuthorizationError, ::Google::Apis::ClientError => e
          if !tried_refresh && auth_error?(e) && refresh!(account: account)
            tried_refresh = true
            retry
          end
          raise
        end
      end
    end

    # auth_error?, refresh!, circuit_options: mirror people_client.rb exactly
  end
end

Syncer skeleton (mirrors ContactsSyncer)

# backend/app/services/emails/import/google/gmail_metadata_syncer.rb
module Emails
  module Import
    module Google
      class GmailMetadataSyncer < ApplicationService
        def initialize(account:, force_full: false)
          @account = account
          @client = ::Google::GmailMetadataClient.new
          @state = ::GmailMetadataSyncState.find_or_create_by!(google_account: account)
          @force_full = force_full
        end

        def call
          return failure(error: 'missing scope') unless account.has_gmail_metadata_scope?

          if force_full || state.history_id.blank?
            full_sync!
          else
            incremental_sync!
          end

          state.update!(last_synced_at: Time.current, last_status: :success, last_error: nil)
          success
        rescue ::Google::GmailMetadataClient::HistoryIdExpiredError
          # Stored historyId is too old (404 from history.list): fall back to a full sync.
          state.update!(history_id: nil)
          full_sync!
          state.update!(last_synced_at: Time.current, last_status: :success, last_error: nil)
          success
        rescue => e
          state.update!(last_status: :error, last_error: e.message)
          raise
        end

        private

        # initialize only sets ivars; the methods below use these readers.
        attr_reader :account, :client, :state, :force_full

        def full_sync!
          latest_history_id = nil
          # One label per request: Gmail ANDs multiple labelIds.
          %w[INBOX SENT].each do |label|
            page_token = nil
            loop do
              resp = client.list_messages(account:, page_token: page_token, label_ids: [label])
              fetched = ingest(resp[:messages])
              # messages.list returns only id + threadId; historyId comes from messages.get.
              latest_history_id = [latest_history_id, *fetched.map(&:history_id)].compact.max
              page_token = resp[:next_page_token]
              break if page_token.blank?
            end
          end
          state.update!(history_id: latest_history_id) if latest_history_id
        end

        def incremental_sync!
          page_token = nil
          loop do
            resp = client.list_history(account:, start_history_id: state.history_id)
            ingest_history_records(resp[:history])
            page_token = resp[:next_page_token]
            state.update!(history_id: resp[:history_id]) if resp[:history_id]
            break if page_token.blank?
          end
        end

        # Fetches metadata for each ref, upserts it, and returns the fetched messages.
        def ingest(message_refs)
          message_refs.map do |ref|
            msg = client.get_message_metadata(account:, message_id: ref.id)
            InteractionEvent.upsert(to_event_row(msg), unique_by: :remote_message_id)
            msg
          end
        end

        def to_event_row(msg)
          # Parse headers, detect direction, flag newsletter, normalize timestamp.
          # See Newsletter detection section.
        end

        # ingest_history_records: iterate messageAdded entries, fetch metadata, upsert
      end
    end
  end
end
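
The to_event_row stub above could be filled in roughly as follows. This is a sketch, not the intended implementation: the Structs stand in for the Gmail message object, every row key except remote_message_id is hypothetical, the account_email parameter is an assumption about how direction would be detected, and the regex-based address extraction is deliberately crude.

```ruby
require 'time'

# Stand-ins for the Gmail message shape returned with format: 'metadata'.
Header  = Struct.new(:name, :value)
Payload = Struct.new(:headers)
Message = Struct.new(:id, :thread_id, :payload)

EMAIL_PATTERN = /[A-Z0-9._%+\-]+@[A-Z0-9.\-]+\.[A-Z]{2,}/i

# Pulls bare addresses out of a header such as "Ana <ana@fund.com>, bo@x.io".
def extract_addresses(value)
  value.to_s.scan(EMAIL_PATTERN).map(&:downcase).uniq
end

# Date headers are RFC 2822; return nil instead of raising on junk values.
def parse_date(value)
  Time.rfc2822(value).utc
rescue ArgumentError, TypeError
  nil
end

# Outbound when the connected account is the sender, inbound otherwise.
def to_event_row(msg, account_email:)
  headers = msg.payload.headers.to_h { |h| [h.name.downcase, h.value] }
  from = extract_addresses(headers['from']).first
  {
    remote_message_id: msg.id,
    thread_id: msg.thread_id,
    direction: from == account_email.downcase ? :outbound : :inbound,
    from: from,
    recipients: %w[to cc bcc].flat_map { |name| extract_addresses(headers[name]) }.uniq,
    occurred_at: parse_date(headers['date'])
  }
end
```

Header names are downcased before lookup because providers do not agree on capitalization (Message-ID vs. Message-Id).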

Newsletter detection (runs at ingestion)

def newsletter?(msg)
  headers = msg.payload.headers.to_h { |h| [h.name.downcase, h.value] }

  return true if headers['list-unsubscribe'].present?
  return true if headers['precedence']&.match?(/bulk|list|junk/i)
  recipient_count = %w[to cc bcc].sum { |h| (headers[h] || '').split(',').size }
  return true if recipient_count > 20

  false
end

Rate-limit posture

Gmail API quota: 1 billion units/day/project, 250 units/user/second. messages.list costs 5 units per call and messages.get costs 5 units per call. A 24-month backfill of 20,000 messages needs 20,000 gets plus about 40 list pages (500 per page), so roughly 100k units (≈400 seconds at the per-user cap, so parallelize with care). Rate limiting falls out of the Sidekiq concurrency cap, with CircuitBox as the backstop once Gmail starts returning rate-limit errors.
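
Counting one messages.get per message plus one messages.list per 500-message page, the backfill estimate can be reproduced as below. The unit costs and the per-user cap are the figures quoted in this section; backfill_cost is a hypothetical helper name.

```ruby
LIST_COST = 5                      # units per messages.list call
GET_COST = 5                       # units per messages.get call
PER_USER_UNITS_PER_SECOND = 250
PAGE_SIZE = 500                    # maxResults ceiling for messages.list

# Quota units for a full backfill, and the minimum wall-clock time the
# per-user cap allows no matter how many workers run in parallel.
def backfill_cost(message_count)
  list_calls = (message_count.to_f / PAGE_SIZE).ceil
  units = list_calls * LIST_COST + message_count * GET_COST
  {
    list_calls: list_calls,
    units: units,
    min_seconds: (units.to_f / PER_USER_UNITS_PER_SECOND).ceil
  }
end

backfill_cost(20_000)
# => { list_calls: 40, units: 100_200, min_seconds: 401 }
```

The list calls are noise next to the gets, so the per-message metadata fetch dominates the budget.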

3.4 Google Calendar metadata client

Simpler than Gmail: one endpoint, one sync token, one response shape.

What calendar.readonly returns

Per events.list docs:

  • Incremental: syncToken (opaque), 410 on expiry → full sync.
  • Expansion: singleEvents=true expands recurring events into instances. Pair with orderBy: 'startTime'.
  • Attendees: array with email, displayName, responseStatus (needsAction, declined, tentative, accepted).
  • Pagination: maxResults default 250, max 2500. nextPageToken.

Client skeleton

# backend/lib/google/calendar_client.rb
require 'google/apis/calendar_v3'

module Google
  class CalendarClient
    def list_events(account:, calendar_id: 'primary', sync_token: nil, page_token: nil,
                    time_min: nil, max_results: 2500)
      with_refresh(account: account) do
        service = build_service(account: account)
        service.list_events(
          calendar_id,
          sync_token: sync_token,
          page_token: page_token,
          max_results: max_results,
          single_events: true,
          order_by: sync_token.blank? ? 'startTime' : nil,   # orderBy forbidden with sync_token
          time_min: time_min,                                # only used for first full sync
          show_deleted: true
        )
      end
    end
    # build_service, with_refresh identical pattern to gmail client
  end
end

Syncer loop (410 Gone recovery)

Mirrors ContactsSyncer's existing expired_sync_token_error? pattern (contacts_syncer.rb:159):

def fetch_page(page_token:, sync_token:)
  client.list_events(account:, page_token:, sync_token:)
rescue ::Google::Apis::ClientError => e
  raise unless e.status_code.to_i == 410
  state.update!(sync_token: nil)   # trigger full resync on next call
  client.list_events(account:, time_min: 24.months.ago.iso8601)
end
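
Put together, the paging loop and the 410 fallback could be shaped like this. A sketch under stated assumptions: the fetch callable stands in for the client plus the 410 translation, and SyncTokenExpired, Page, and sync_events are hypothetical names used so the example runs without the Google gem.

```ruby
# Raised by the fetcher when the API answers 410 Gone for a stale sync token.
class SyncTokenExpired < StandardError; end

Page = Struct.new(:items, :next_page_token, :next_sync_token, keyword_init: true)

# Pages through events until the last page, which carries the next sync token.
# On an expired token the whole run restarts as a full sync.
def sync_events(fetch:, sync_token: nil)
  events = []
  page_token = nil
  loop do
    page = fetch.call(sync_token: sync_token, page_token: page_token)
    events.concat(page.items)
    page_token = page.next_page_token
    return { events: events, sync_token: page.next_sync_token } if page_token.nil?
  end
rescue SyncTokenExpired
  raise if sync_token.nil?   # a full sync has no token to expire; surface it
  sync_token = nil
  retry                      # re-runs the method body with the token cleared
end
```

Restarting from scratch, rather than resuming mid-page, avoids mixing results from two different sync generations.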

Backfill horizon

First full sync passes time_min: 24.months.ago.iso8601. Per DR-08 in the technical spec. Nightly sync uses the stored syncToken.

3.5 Shared mailbox options on Google

Three flavors of "shared" exist. None require new client code beyond variations on delegation.

| Flavor | How users access it | Visible via user's Gmail API grant? | Complexity |
|--------|---------------------|-------------------------------------|------------|
| Google Group / distribution list (e.g. deals@inovia.ca as a group) | Inbound fans out to each member's personal inbox | YES: free, no config | None. Already covered by personal-mailbox sync. |
| Gmail delegation (user-level, via Settings > Accounts) | Delegate sees the mailbox as a separate view in Gmail UI | NO. Gmail API does not expose delegated mailboxes through the delegate's OAuth token. | High: requires workspace admin-granted domain-wide delegation via a service account. |
| Domain-wide delegation (service-account impersonation) | Admin grants service account the right to impersonate any user in the domain | YES, per-user: service account acquires a token for any user email in the domain | High. Requires admin consent. Service account credentials stored in Getro. Per-mailbox sync. |

Recommendation: Google Groups covers 90% of "shared inbox" patterns for VC firms. If Inovia uses deals@ as a group, we're done. Domain-wide delegation is the "escape hatch" for true shared mailboxes but is expensive: it means asking IT to grant service-account scopes at the domain level, and Getro would need to store separate sync state per (service-account, impersonated-email). Defer unless a specific customer demands it.

4. Microsoft 365

Greenfield · 2 new clients + OAuth + admin-portal UI

4.1 Azure AD multi-tenant app registration

This is the prerequisite that has to happen first and is partly ops work. The Azure app is where Getro registers as a third-party consumer of Microsoft 365 data. It has to be multi-tenant so a user at any Microsoft 365 tenant can consent.

Step-by-step

  1. Go to Microsoft Entra admin center, App registrations → New registration.
  2. Name: Getro Production (and Getro Sandbox for non-prod).
  3. Supported account types: "Accounts in any organizational directory (any Microsoft Entra ID tenant, Multitenant)" → signInAudience = AzureADMultipleOrgs.
  4. Redirect URI (Web): https://api.getro.com/auth/microsoft/callback (plus staging + dev variants).
  5. Under Certificates & secrets → add a client secret (store in Getro's env vars).
  6. Under API permissions → Add a permission → Microsoft Graph → Delegated permissions:
    • offline_access
    • User.Read
    • Mail.ReadBasic
    • Calendars.ReadBasic
    • Contacts.Read (optional, only if we sync MS contacts too)
    • Mail.Read.Shared (optional, shared mailbox support)
    • Calendars.Read.Shared (optional, shared calendar support)
  7. Under Authentication → enable Access tokens + ID tokens.
  8. Use common as the tenant segment of the authority URL (or organizations for enterprise-only).

Tenant ID: common vs organizations. common accepts personal Microsoft accounts (Xbox, Hotmail) plus work/school. organizations is work/school only. Given Inovia-style targets are enterprise, organizations is the safer default: it prevents a user from connecting a personal @outlook.com mailbox and polluting signals.

Admin consent

For Mail.ReadBasic, Calendars.ReadBasic, Mail.Read.Shared: individual user consent suffices. No admin consent required. (Verified in MS Graph permissions reference.)

Some enterprise tenants disable user-level consent globally via IT policy. In those cases the user sees "Need admin approval" and we show a "Request approval from your IT admin" CTA in the admin portal. Microsoft provides a standard admin-consent URL pattern: https://login.microsoftonline.com/{tenant}/adminconsent?client_id={app}&redirect_uri={cb}.
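
Building that link server-side is a one-liner once the parameters are URL-encoded. A minimal sketch that follows the URL pattern quoted above; admin_consent_url is a hypothetical helper name and the client ID in the usage note is a dummy value.

```ruby
require 'uri'

# Builds the admin-consent link for the "Request approval from your IT admin" CTA.
# tenant is the customer's tenant ID or domain (or 'organizations').
def admin_consent_url(tenant:, client_id:, redirect_uri:)
  query = URI.encode_www_form(client_id: client_id, redirect_uri: redirect_uri)
  "https://login.microsoftonline.com/#{tenant}/adminconsent?#{query}"
end
```

For example, admin_consent_url(tenant: 'organizations', client_id: 'abc', redirect_uri: 'https://api.getro.com/auth/microsoft/callback') yields a link whose redirect_uri is percent-encoded, which Microsoft requires to match a registered redirect URI exactly.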

4.2 MicrosoftAccount model

Mirror of GoogleAccount. Same encryption pattern, same scopes-as-JSONB shape.

# db/migrate/YYYYMMDDHHMMSS_create_microsoft_accounts.rb
class CreateMicrosoftAccounts < ActiveRecord::Migration[7.2]
  def change
    create_table :microsoft_accounts do |t|
      t.references :user, null: false, foreign_key: true
      t.string :email, null: false
      t.string :microsoft_uid, null: false   # Graph's id for the user
      t.string :tenant_id                    # tenant of the authenticating user
      t.text   :oauth_access_token
      t.text   :oauth_refresh_token
      t.datetime :revoked_at
      t.jsonb  :scopes
      t.integer :status, default: 0, null: false
      t.timestamps
    end
    add_index :microsoft_accounts, [:user_id, :microsoft_uid], unique: true
  end
end

# app/models/microsoft_account.rb
class MicrosoftAccount < ApplicationRecord
  belongs_to :user
  has_one :ms_mail_sync_state, dependent: :destroy
  has_one :ms_calendar_sync_state, dependent: :destroy

  encrypts :oauth_access_token
  encrypts :oauth_refresh_token

  MS_SCOPES = {
    mail_read_basic:      'Mail.ReadBasic',
    calendars_read_basic: 'Calendars.ReadBasic',
    mail_read_shared:     'Mail.Read.Shared',
    calendars_read_shared:'Calendars.Read.Shared',
    offline_access:       'offline_access',
    user_read:            'User.Read'
  }.freeze

  def has_refresh_token?       = oauth_refresh_token.present?
  def has_mail_scope?          = granted_scopes.include?('Mail.ReadBasic')
  def has_calendar_scope?      = granted_scopes.include?('Calendars.ReadBasic')
  def has_shared_mail_scope?   = granted_scopes.include?('Mail.Read.Shared')
  def has_shared_calendar_scope? = granted_scopes.include?('Calendars.Read.Shared')
  def granted_scopes           = Array(scopes)
end

4.3 OAuth flow + gem choice

No Microsoft gems are in Gemfile today. Three options ranked by fit:

| Option | Why pick | Why not |
|--------|----------|---------|
| Faraday + manual OAuth (recommended) | Consistent with lib/google/oauth_client.rb pattern. Zero new gems. Full control. | We write token exchange + refresh ourselves (~80 LOC). |
| omniauth-azure-activedirectory-v2 | Composable with existing OmniAuth infrastructure (omniauth is in Gemfile). | Adds a gem for ~what Faraday does inline. Overkill for one integration. |
| msal-ruby (community port) | Mirrors Microsoft's official MSAL libs in other langs. | Low-maintenance status on GitHub; may lag Graph updates. Adds runtime deps. |

Recommend Faraday manual: it matches how lib/google/oauth_client.rb was built.

Client skeleton

# backend/lib/microsoft/oauth_client.rb
require 'faraday'
require 'json'

module Microsoft
  class OauthClient
    AUTHORITY = 'https://login.microsoftonline.com'
    TENANT = 'organizations'   # see DR in technical spec
    # Full URL on purpose: a leading-slash path on a Faraday connection replaces
    # the base path, which would drop the tenant segment from the request.
    TOKEN_URL = "#{AUTHORITY}/#{TENANT}/oauth2/v2.0/token"
    SCOPE = 'offline_access openid email User.Read Mail.ReadBasic Calendars.ReadBasic'

    def exchange_code_for_token(code:, redirect_uri:)
      post_token(grant_type: 'authorization_code', code: code, redirect_uri: redirect_uri)
    end

    def refresh_access_token(refresh_token:)
      post_token(grant_type: 'refresh_token', refresh_token: refresh_token)
    end

    private

    # Returns the parsed token response, or nil on any non-200 status.
    def post_token(params)
      resp = connection.post(TOKEN_URL, params.merge(
        client_id: ENV['MICROSOFT_CLIENT_ID'],
        client_secret: ENV['MICROSOFT_CLIENT_SECRET'],
        scope: SCOPE
      ))
      resp.status == 200 ? JSON.parse(resp.body) : nil
    end

    def connection
      Faraday.new { |f| f.request :url_encoded; f.adapter Faraday.default_adapter }
    end
  end
end

Admin-portal change

Mirror userGoogleAccountsV2.js:

// admin-portal/src/services/userMicrosoftAccountsV2.js
export const userMicrosoftAccountsApi = createApi({
  reducerPath: 'userMicrosoftAccountsApi',
  baseQuery,
  tagTypes: ['microsoft-accounts'],
  endpoints: (builder) => ({
    getUserMicrosoftAccounts: builder.query({ /* GET */ }),
    createUserMicrosoftAccount: builder.mutation({ /* POST { code, redirect_uri } */ }),
    deleteUserMicrosoftAccount: builder.mutation({ /* DELETE */ }),
  }),
});

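Since there's no @react-oauth/microsoft equivalent as clean as Google's, a hand-rolled popup flow is cleanest. Microsoft's MSAL.js is the alternative, with one caveat: acquireTokenPopup finishes the code exchange in the browser and resolves to an AuthenticationResult (access token, ID token, account), so it does not yield the authorization code that POST { code, redirect_uri } expects. Treat the snippet as the shape of the scope selection, not a working handoff:

import { PublicClientApplication } from '@azure/msal-browser';

const redirectUri = window.location.origin + '/auth/microsoft/callback';

const msal = new PublicClientApplication({
  auth: {
    clientId: import.meta.env.VITE_MS_CLIENT_ID,
    authority: 'https://login.microsoftonline.com/organizations',
    redirectUri,
  },
});

const handleMicrosoftConnect = async (scopeChoices) => {
  const scopes = ['offline_access', 'User.Read']
    .concat(scopeChoices.mail ? ['Mail.ReadBasic'] : [])
    .concat(scopeChoices.calendar ? ['Calendars.ReadBasic'] : [])
    .concat(scopeChoices.sharedMail ? ['Mail.Read.Shared'] : [])
    .concat(scopeChoices.sharedCalendar ? ['Calendars.Read.Shared'] : []);

  const result = await msal.acquireTokenPopup({ scopes });
  // TODO: result carries tokens, not an auth code, so result.code is undefined here.
  // The backend exchange needs the code from a hand-rolled popup against /authorize.
  await createMicrosoftAccount({ code: result.code, redirectUri }).unwrap();
};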

4.4 Mail.ReadBasic client

Per user-list-messages and delta query messages:

  • Endpoint: GET /me/messages for full listing. GET /me/mailFolders/{id}/messages/delta for incremental (per-folder, typically Inbox + Sent Items).
  • Projection: $select=id,conversationId,from,toRecipients,ccRecipients,bccRecipients,sentDateTime,receivedDateTime,subject,internetMessageId,isRead,internetMessageHeaders
  • Pagination: Follow @odata.nextLink verbatim; do not parse $skip.
  • Paging size: Prefer: odata.maxpagesize=100 (max 1000).
  • Delta lifecycle: GET returns @odata.nextLink (continue) or @odata.deltaLink (round complete, save for next sync). deltaLink expires after ~30 days; expired token returns 410.
  • Deleted messages: in delta response as { id, "@removed": { "reason": "deleted" } }.
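
Each delta page therefore mixes two kinds of entries: messages to upsert and tombstones to delete. A minimal sketch of the split, operating on the parsed JSON value array; partition_delta is a hypothetical helper name.

```ruby
# Splits one delta page into rows to upsert and remote IDs to delete.
# Tombstones are the entries carrying an "@removed" annotation.
def partition_delta(items)
  removed, upserts = items.partition { |item| item.key?('@removed') }
  { upsert: upserts, delete_ids: removed.map { |item| item['id'] } }
end
```

Checking for the annotation key, rather than for missing fields, keeps the split correct even when $select trims the upserted messages down to a few properties.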

Client skeleton

# backend/lib/microsoft/mail_client.rb
module Microsoft
  class MailClient
    GRAPH = 'https://graph.microsoft.com/v1.0'
    SELECT = %w[id conversationId from toRecipients ccRecipients bccRecipients
                sentDateTime receivedDateTime subject internetMessageId isRead
                internetMessageHeaders].join(',')

    def list_messages_delta(account:, folder_id: 'Inbox', delta_link: nil, page_url: nil)
      with_refresh(account: account) do
        url = page_url || delta_link || "#{GRAPH}/me/mailFolders/#{folder_id}/messages/delta?$select=#{SELECT}"
        resp = faraday(account: account).get(url) do |req|
          req.headers['Prefer'] = 'odata.maxpagesize=100'
        end
        raise_for_status(resp)
        json = JSON.parse(resp.body)
        {
          messages:    json['value'],
          next_link:   json['@odata.nextLink'],
          delta_link:  json['@odata.deltaLink']
        }
      end
    end

    private

    def faraday(account:)
      Faraday.new do |f|
        f.headers['Authorization'] = "Bearer #{account.oauth_access_token}"
        f.headers['Accept'] = 'application/json'
        f.adapter Faraday.default_adapter
      end
    end

    def raise_for_status(resp)
      case resp.status
      when 200..299 then return
      when 401 then raise Microsoft::AuthError
      when 410 then raise Microsoft::DeltaTokenExpired
      when 429 then raise Microsoft::Throttled.new(retry_after: resp.headers['Retry-After'].to_i)
      else raise Microsoft::ApiError.new("#{resp.status}: #{resp.body}")
      end
    end

    # with_refresh: same pattern as Google (refresh on 401, wrap in CircuitBox)
  end
end
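
The with_refresh helper referenced in the skeleton could mirror the Google single-retry shape like this. A sketch only: the RefreshOnAuthError mixin name is hypothetical, refresh! is assumed to return a truthy value on success (as in people_client.rb), and the CircuitBox wrapper is left out so the example runs without the gem.

```ruby
module Microsoft
  class AuthError < StandardError; end

  # Hypothetical mixin for MailClient and CalendarClient. The including class
  # must define refresh!(account:), returning truthy when the token was renewed.
  module RefreshOnAuthError
    # Runs the block; on the first AuthError it refreshes the token and
    # retries once. A second AuthError, or a failed refresh, propagates.
    def with_refresh(account:)
      tried_refresh = false
      begin
        yield
      rescue AuthError
        raise if tried_refresh || !refresh!(account: account)
        tried_refresh = true
        retry
      end
    end
  end
end
```

In the real client the whole body would sit inside Circuitbox.circuit(:ms_graph_mail, circuit_options).run, the same way the Gmail client wraps it.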

Throttling

Per MS Graph throttling docs: 429 responses include Retry-After (seconds). Respect it. Our Microsoft::Throttled exception carries the value; the syncer lets it propagate, and the worker's sidekiq_retry_in block reads it to schedule the delayed retry.
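
The delay calculation for that block can be kept as a plain callable, which also makes it testable without Sidekiq. A minimal sketch: the exception class matches the Microsoft::Throttled.new(retry_after:) usage in the client skeleton, THROTTLE_RETRY_DELAY is a hypothetical constant name, and the one-second floor is an assumption.

```ruby
module Microsoft
  # Carries the Retry-After value (seconds) from a 429 response.
  class Throttled < StandardError
    attr_reader :retry_after

    def initialize(retry_after:)
      @retry_after = retry_after
      super("throttled; retry after #{retry_after}s")
    end
  end
end

# Takes (retry_count, exception), the arguments a sidekiq_retry_in block receives.
# Returns the delay in seconds for throttling errors and nil otherwise,
# which leaves other failures on Sidekiq's default backoff.
THROTTLE_RETRY_DELAY = lambda do |_count, exception|
  next nil unless exception.is_a?(Microsoft::Throttled)

  [exception.retry_after.to_i, 1].max   # floor of 1s if the header was missing
end
```

In the worker this would be wired as sidekiq_retry_in { |count, exception| THROTTLE_RETRY_DELAY.call(count, exception) }.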

Syncer skeleton (delta loop)

module Emails
  module Import
    module Microsoft
      class MailSyncer < ApplicationService
        FOLDERS = %w[Inbox SentItems].freeze

        def initialize(account:)
          @account = account
          @client = ::Microsoft::MailClient.new
        end

        def call
          FOLDERS.each { |folder| sync_folder(folder) }
          success
        end

        private

        # sync_folder refers to `account`; expose the ivar set in initialize.
        attr_reader :account

        def sync_folder(folder_id)
          state = MsMailSyncState.find_or_create_by!(microsoft_account: account, folder_id: folder_id)
          page_url = nil
          delta_link = state.delta_link

          loop do
            resp = @client.list_messages_delta(account:, folder_id: folder_id, delta_link: delta_link, page_url: page_url)
            ingest(resp[:messages])

            if resp[:next_link]
              page_url   = resp[:next_link]
              delta_link = nil
            elsif resp[:delta_link]
              state.update!(delta_link: resp[:delta_link])
              break
            else
              break
            end
          end
        rescue ::Microsoft::DeltaTokenExpired
          state.update!(delta_link: nil)
          retry
        rescue ::Microsoft::Throttled
          # Re-raise unchanged so the worker's sidekiq_retry_in block can read retry_after.
          raise
        end
      end
    end
  end
end

4.5 Calendars.ReadBasic client

Per user-list-events:

  • Endpoint: GET /me/events for listing. Delta: GET /me/calendarView/delta (note: the delta function runs against a calendar view bounded by startDateTime and endDateTime, not against /me/events/delta).
  • Fields: id, subject, organizer, attendees[], start, end, isAllDay, isCancelled.
  • Attendee response values (attendee status.response): none, organizer, tentativelyAccepted, accepted, declined, notResponded.
  • Timezone: send Prefer: outlook.timezone="UTC" on every request to avoid per-user timezone drift.
  • Projection: $select=id,subject,organizer,attendees,start,end,isAllDay,isCancelled. Calendars.ReadBasic already excludes body and attachments, so $select here is a payload optimization, not a privacy control.

Attendee filtering

For scoring, we count attendees whose status.response is not 'declined'. We skip cancelled events entirely (isCancelled: true).
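
The filter above can be sketched as a small pure function over a parsed Graph event. This assumes string keys (as produced by JSON.parse) and that each attendee carries its response under status.response. countable_attendees is a hypothetical name.

```ruby
# Returns the attendees that should count toward scoring for one event hash.
# Cancelled events contribute nothing; declined attendees are dropped.
def countable_attendees(event)
  return [] if event['isCancelled']

  Array(event['attendees']).reject do |attendee|
    attendee.dig('status', 'response') == 'declined'
  end
end

event = {
  'isCancelled' => false,
  'attendees' => [
    { 'emailAddress' => { 'address' => 'a@example.com' }, 'status' => { 'response' => 'accepted' } },
    { 'emailAddress' => { 'address' => 'b@example.com' }, 'status' => { 'response' => 'declined' } },
    { 'emailAddress' => { 'address' => 'c@example.com' }, 'status' => { 'response' => 'none' } }
  ]
}

countable_attendees(event).map { |a| a.dig('emailAddress', 'address') }
# => ["a@example.com", "c@example.com"]
```

Attendees with no recorded response (none, notResponded) are kept, matching the "anything but declined" rule.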

Client skeleton

# backend/lib/microsoft/calendar_client.rb
module Microsoft
  class CalendarClient
    GRAPH = 'https://graph.microsoft.com/v1.0'
    SELECT = %w[id subject organizer attendees start end isAllDay isCancelled].join(',')

    def list_events_delta(account:, delta_link: nil, page_url: nil, start_date_time: nil, end_date_time: nil)
      url = page_url || delta_link || build_initial_url(start_date_time, end_date_time)
      with_refresh(account: account) do
        resp = faraday(account: account).get(url) do |req|
          req.headers['Prefer'] = 'outlook.timezone="UTC", odata.maxpagesize=100'
        end
        raise_for_status(resp)
        parse_response(resp)
      end
    end

    private

    def build_initial_url(start_dt, end_dt)
      start_dt ||= 24.months.ago.iso8601
      end_dt   ||= 1.year.from_now.iso8601
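      # Caveat to verify: the Graph "event: delta" reference says a calendarView delta
      # returns the full property set and does not support $select; drop it if rejected.
      "#{GRAPH}/me/calendarView/delta?startDateTime=#{CGI.escape(start_dt)}&endDateTime=#{CGI.escape(end_dt)}&$select=#{SELECT}"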
    end
  end
end
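
The skeleton calls parse_response without defining it. One possible shape, offered as a sketch rather than the intended implementation, splits changed events from removals (Graph delta responses mark deleted items with an @removed annotation) and surfaces the paging links. parse_delta_body is a hypothetical name.

```ruby
require 'json'

# Parses a Graph delta response body into changed events, removed ids,
# and the nextLink / deltaLink used to continue or resume the sync.
def parse_delta_body(body)
  json = JSON.parse(body)
  removed, changed = Array(json['value']).partition { |item| item.key?('@removed') }

  {
    events:      changed,
    removed_ids: removed.map { |item| item['id'] },
    next_link:   json['@odata.nextLink'],
    delta_link:  json['@odata.deltaLink']
  }
end
```

Returning the same next_link / delta_link keys as the mail client would let the calendar syncer reuse the mail syncer's loop structure.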

4.6 Shared mailbox: the good news

Per the MS Graph permissions reference: Mail.Read.Shared is a delegated permission that does NOT require admin consent. Same for Calendars.Read.Shared.

How it works

  1. Inovia's IT admin grants Alice "Full Access" permission to deals@inovia.ca via Exchange admin center (their side; we have no control over this).
  2. Alice connects her Microsoft 365 to Getro, granting Mail.Read.Shared.
  3. Getro calls GET /users/deals@inovia.ca/messages using Alice's token. Graph sees that Alice has the underlying Exchange permission and serves the messages.

Discovering shared mailboxes

No single Graph endpoint lists "shared mailboxes the signed-in user has access to." Practical pattern:

  • Ask Alice in the admin-portal UI: "List any shared mailboxes you want to sync (email addresses)."
  • Store each as a SharedMailbox row linked to the MicrosoftAccount.
  • Sync each independently.

Syncer pattern

class SharedMailboxSyncer < ApplicationService
  def initialize(microsoft_account:, shared_mailbox:)
    @account = microsoft_account
    @mailbox = shared_mailbox   # e.g. deals@inovia.ca
    @client = Microsoft::MailClient.new
  end

  def call
    # Instead of /me/messages, hit /users/{mailboxId}/messages
    @client.list_messages_for_user(
      account: @account,
      user_id: @mailbox.email_address
    )
    # ... ingest, same as personal
  end
end
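
The syncer above calls list_messages_for_user, which the MailClient skeleton does not define yet. The endpoint swap it relies on could look like the sketch below. messages_url_for, GRAPH_BASE, and SIMPLE_EMAIL are hypothetical names, and the address check is a deliberately loose guard for user-entered input, not full address validation.

```ruby
GRAPH_BASE = 'https://graph.microsoft.com/v1.0'

# Loose shape check: one "@", no whitespace or URL-structural characters.
SIMPLE_EMAIL = /\A[^@\s\/?#]+@[^@\s\/?#]+\z/

# Builds the shared-mailbox equivalent of /me/messages for a given address.
def messages_url_for(mailbox_address, select:)
  address = mailbox_address.to_s.strip.downcase
  unless address.match?(SIMPLE_EMAIL)
    raise ArgumentError, "not a mailbox address: #{mailbox_address.inspect}"
  end

  "#{GRAPH_BASE}/users/#{address}/messages?$select=#{select.join(',')}"
end

messages_url_for('Deals@Inovia.ca', select: %w[id from toRecipients])
# => "https://graph.microsoft.com/v1.0/users/deals@inovia.ca/messages?$select=id,from,toRecipients"
```

Rejecting malformed input before building the URL matters here because the addresses come from a free-text field in the admin portal.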

Attribution gotcha

When a founder emails deals@inovia.ca, the InteractionEvent needs a user_id. Options:

  • Per-subscriber attribution: create one event per team member with shared-mailbox access. 5 partners × 1 email = 5 events. Strength signal accrues to all 5.
  • Single-event + multi-attribution column: one event, user_ids as an array. Requires schema change.

The first option is simpler and matches how group-mailbox signals already accrue. Recommend it.
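
A sketch of the per-subscriber fan-out, assuming the syncer already knows which user ids have access to the shared mailbox. fan_out_events and the dedup_key field are hypothetical and only illustrate keeping re-syncs idempotent. The real InteractionEvent columns may differ.

```ruby
# Builds one event attributes hash per team member with mailbox access.
def fan_out_events(message_id:, contact_email:, occurred_at:, subscriber_user_ids:)
  subscriber_user_ids.uniq.map do |user_id|
    {
      user_id:       user_id,
      contact_email: contact_email,
      occurred_at:   occurred_at,
      # Hypothetical idempotency key: same message + same user => same row.
      dedup_key:     "#{message_id}:#{user_id}"
    }
  end
end

rows = fan_out_events(
  message_id: 'msg-123',
  contact_email: 'founder@startup.io',
  occurred_at: Time.utc(2024, 3, 1, 15, 0, 0),
  subscriber_user_ids: [1, 2, 3, 4, 5]
)
rows.size # => 5
```

Five partners on one mailbox yield five rows for one inbound email, which is the "5 events" case described above.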

Microsoft makes shared mailboxes cheap: one extra scope, no admin consent, and a straightforward endpoint swap. Google makes them expensive (domain-wide delegation, admin consent). If shared mailboxes are a must-have, Microsoft users get the feature first; Google users get "use a Google Group" as the workaround.

5. Shared-mailbox decision matrix

Consolidated trade-offs for the team's call on which shared-mailbox flavors to include in V1.

Flavor | Provider | Engineering effort | Customer setup | Admin consent | V1 recommendation
------ | -------- | ------------------ | -------------- | ------------- | -----------------
Google Group | Google | None (already covered) | None | None | ✅ Ship
True Gmail shared mailbox (delegation) | Google | High: service account + domain-wide delegation | Admin grants DWD to service account | Required | Defer
M365 shared mailbox | Microsoft | Medium: one extra scope + sync-per-mailbox worker | User lists mailboxes in UI | Not required | ✅ Ship
M365 Group | Microsoft | None (already covered) | None | None | ✅ Ship

Net V1 position

Three of four shared-mailbox flavors ship free or near-free. Only Gmail delegation deserves deferral. This gives the team a concrete "complexity vs UX" answer: ship Google Groups + all M365 flavors; defer Gmail delegation unless a specific customer demands it.

6. Cross-provider concerns

Self-emails / team-member-as-contact

Emailing a colleague shouldn't make them look like a Contact. Before inserting InteractionEvent:

return if User.where(email: contact_email, collection_id: account.collections.select(:id)).exists?

Same check for both providers. The resolver never attempts to match internal team emails to Contacts.

BCC semantics

Each provider exposes BCC only on the sender's own messages:

  • Gmail: BCC header appears on messages where the user is the From, never on received messages.
  • MS Graph: bccRecipients is returned only when the signed-in user is the sender.

When bccRecipients is present on an outbound message, attribute the interaction to each BCC recipient (same as To + Cc).
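
For the Graph side, collecting the attributable recipients of an outbound message can be sketched as below. It assumes a message hash parsed from JSON with the standard recipient shape (emailAddress.address). outbound_recipient_addresses is a hypothetical name.

```ruby
# Returns the unique, lowercased addresses across To, Cc, and Bcc.
# bccRecipients is simply absent or empty on messages the user did not send.
def outbound_recipient_addresses(message)
  %w[toRecipients ccRecipients bccRecipients]
    .flat_map { |field| Array(message[field]) }
    .filter_map { |recipient| recipient.dig('emailAddress', 'address')&.strip&.downcase }
    .reject(&:empty?)
    .uniq
end

message = {
  'toRecipients'  => [{ 'emailAddress' => { 'address' => 'Founder@Startup.io' } }],
  'ccRecipients'  => [],
  'bccRecipients' => [
    { 'emailAddress' => { 'address' => 'partner@fund.com' } },
    { 'emailAddress' => { 'address' => 'founder@startup.io' } }
  ]
}

outbound_recipient_addresses(message)
# => ["founder@startup.io", "partner@fund.com"]
```

Deduplicating after lowercasing keeps a contact who appears in both To and Bcc from being counted twice for the same message.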

Timezone handling

  • Gmail: internalDate is always Unix epoch ms UTC. No ambiguity.
  • Google Calendar: start.dateTime + start.timeZone. Always normalize to UTC before storing.
  • MS Graph Mail: sentDateTime / receivedDateTime are ISO-8601 UTC. Store as-is.
  • MS Graph Calendar: start is a dateTimeTimeZone object. Send Prefer: outlook.timezone="UTC" on every request to force UTC responses.
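
The four cases above can be normalized with plain Ruby. This is a sketch under stated assumptions: Gmail's internalDate arrives as a string of epoch milliseconds, Google Calendar timed events carry an RFC 3339 dateTime with offset, and Graph calendar values come back in UTC because of the Prefer header. All helper names are hypothetical.

```ruby
require 'time'

# Gmail: internalDate is epoch milliseconds (delivered as a string).
def gmail_internal_date_to_utc(internal_date_ms)
  ms = Integer(internal_date_ms)
  Time.at(ms / 1000, ms % 1000, :millisecond).utc
end

# Google Calendar: timed events only; all-day events use 'date' instead.
def google_calendar_time_to_utc(start)
  Time.iso8601(start.fetch('dateTime')).utc
end

# MS Graph Mail: already ISO-8601 UTC, e.g. "2024-03-01T15:00:00Z".
def graph_mail_time_to_utc(value)
  Time.iso8601(value).utc
end

# MS Graph Calendar: dateTimeTimeZone has no offset in dateTime, so we
# insist on timeZone == "UTC" and append "Z" before parsing.
def graph_calendar_time_to_utc(date_time_time_zone)
  zone = date_time_time_zone['timeZone']
  raise ArgumentError, "expected UTC, got #{zone.inspect}" unless zone == 'UTC'

  Time.iso8601("#{date_time_time_zone.fetch('dateTime')}Z").utc
end
```

Raising on a non-UTC Graph calendar value makes a missing Prefer header fail loudly instead of silently storing local times.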

Newsletter / bulk inbound filter

Header inspection works on both providers; the header names are RFC standards, not provider-specific:

  • Gmail: call messages.get(format: 'metadata') with metadata_headers: ['List-Unsubscribe', 'Precedence'].
  • MS Graph: internetMessageHeaders field in the $select projection.

Apply identical logic (see Gmail section for detection predicate).
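
Both Gmail's payload.headers and Graph's internetMessageHeaders are arrays of name/value pairs, so a single predicate can serve both. The sketch below is an assumed stand-in for the detection predicate defined in the Gmail section, not a copy of it. bulk_message?, header_value, and the BULK_PRECEDENCE list are hypothetical.

```ruby
# Assumed set of Precedence values that indicate bulk mail.
BULK_PRECEDENCE = %w[bulk list junk].freeze

# Case-insensitive header lookup over [{ 'name' => ..., 'value' => ... }].
def header_value(headers, name)
  match = Array(headers).find { |header| header['name'].to_s.casecmp?(name) }
  match && match['value'].to_s
end

# True when the message looks like a newsletter or other bulk send.
def bulk_message?(headers)
  return true if header_value(headers, 'List-Unsubscribe')

  precedence = header_value(headers, 'Precedence')
  !precedence.nil? && BULK_PRECEDENCE.include?(precedence.strip.downcase)
end

bulk_message?([{ 'name' => 'Precedence', 'value' => 'Bulk' }]) # => true
bulk_message?([{ 'name' => 'Subject', 'value' => 'Intro' }])   # => false
```

Header names are matched case-insensitively because providers do not preserve a consistent casing.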

Attribution coverage

Independent of provider, coverage depends on Findem's ability to resolve the correspondent email to a canonical identity. Both providers produce the same email addresses; the dedup pipeline treats them identically.

7. Operational envelope (per-provider)

Google: Gmail

Scope: gmail.metadata ✅ approved
Daily quota: 1B units / project
Per-user: 250 units/sec
Incremental: history.list, ≥1 week valid
List cost: 5 units per messages.list, 5 per messages.get

Google: Calendar

Scope: calendar.readonly (declared)
Incremental: syncToken, 410 on expiry
Page size: default 250, max 2500
Backfill horizon: 24 months via timeMin

Microsoft: Mail

Scope: Mail.ReadBasic
Admin consent: Not required
Incremental: Per-folder delta, @odata.deltaLink
Page size: default 10, max 1000; use Prefer: odata.maxpagesize=100
Throttling: 429 + Retry-After

Microsoft: Calendar

Scope: Calendars.ReadBasic
Admin consent: Not required
Incremental: /me/calendarView/delta
Timezone: Prefer: outlook.timezone="UTC"

Microsoft: shared mailbox

Scope: Mail.Read.Shared
Admin consent: Not required
Endpoint: /users/{upn}/messages
Prerequisite: User has FullAccess in Exchange

Google: shared via DWD

Flow: Service account + domain-wide delegation
Admin consent: Required (per-domain)
V1 status: Deferred (see matrix)

8. Decisions to confirm before execution

  1. MS tenant audience: organizations (work/school only) vs. common (also personal MS accounts). Recommend organizations; restricting to work/school accounts prevents personal-account pollution.
  2. Shared mailbox scope in V1: ship Google Groups + M365 shared (both low-cost), defer Gmail delegation. Confirm.
  3. Backfill horizon: 24 months on first connect (per DR-08). Confirm for both providers.
  4. MS OAuth gem choice: Faraday manual (recommended) vs. omniauth-azure-activedirectory-v2. Affects ~80 LOC.
  5. Gmail incremental strategy: history.list from day 1 (recommended; matches the MS delta pattern) vs. always full list + diff. The History API is more efficient.
  6. Personal MS account blocking: if we go with organizations, we need UI copy for users who try to connect user@outlook.com and hit an error.
  7. MS rate-limit handling in Sidekiq: inline retry via sidekiq_retry_in vs. dedicated throttle queue. Inline is simpler.

Next step

Once these decisions are made, the phased plan in the technical spec is ready to convert into JIRA tickets. No external blockers remain except the Azure tenant verification.