Gmail, Outlook, Google Calendar, Outlook Calendar
Concrete plan per provider: what exists, what extends, what's greenfield, with code skeletons and official doc links. Enough for the team to scope execution and choose complexity vs. UX on shared mailboxes.
1. Overview
Two providers, four metadata sources, one normalized pipeline. The Getro backend already has a live Google Contacts sync we extend; Microsoft 365 is greenfield.
| Dimension | Google Workspace | Microsoft 365 |
|---|---|---|
| OAuth infrastructure | Live · Signet + lockbox tokens in GoogleAccount | Build · Mirror GoogleAccount as MicrosoftAccount |
| Scope declaration | Live · gmail.metadata + calendar.readonly already in GOOGLE_SCOPES | Build · Register scopes in Azure app manifest |
| Admin portal UI | Partial · Google card exists, needs per-scope toggles | Build · Mirror Google card + new RTK Query service |
| Email sync | Build · users.messages.list + history.list for incremental | Build · /me/messages + /me/messages/delta |
| Calendar sync | Build · events.list with syncToken | Build · /me/events with delta query |
| Shared mailbox | Google Groups: free; delegation: ⚠️ complex | Mail.Read.Shared: no admin consent |
| Rate limits | 1B units/day/project · 250/user/s | Varies per service · throttled via 429 + Retry-After |
| Metadata scope approval | Approved (per product) | N/A: Mail.ReadBasic does not require verification |
3. Google Workspace
Extend live integration · 3 new clients
3.1 Current state (verified in code)
| Component | Path | Status |
|---|---|---|
| OAuth client | backend/lib/google/oauth_client.rb | Live |
| GoogleAccount model | backend/app/models/google_account.rb | Live · encrypted tokens, scopes JSONB |
| People API client (contacts sync) | backend/app/services/google/people_client.rb | Live · token-refresh-on-401 pattern |
| Contacts syncer service | backend/app/services/contacts/import/google/contacts_syncer.rb | Live · sync-token pagination + 410 fallback |
| Daily scheduler | backend/app/workers/schedulers/contacts/import/google_contacts_daily_sync_scheduler.rb | Live · 6am UTC cron |
| Admin-portal card + hook | admin-portal/src/pages/Settings/integrations/hooks/useIntegrationsPage.jsx | Live · @react-oauth/google popup flow, 3s polling |
| RTK Query service | admin-portal/src/services/userGoogleAccountsV2.js | Live · GET/POST/DELETE on /user_google_accounts |
| CircuitBox wrap on Google | - | Build · gap, retrofit all clients |
| Scope toggles in UI | - | Build · per-source checkboxes + updated useGoogleLogin scope string |
| Gmail metadata client | backend/lib/google/gmail_metadata_client.rb | Build |
| Calendar client | backend/lib/google/calendar_client.rb | Build |
| Gmail / Calendar syncers + jobs + schedulers | backend/app/services/emails/import/google/, backend/app/services/calendars/import/google/ | Build |
GOOGLE_SCOPES in google_account.rb lines 33-38 declares all four scopes (contacts.readonly, contacts.other.readonly, gmail.metadata, calendar.readonly). However, the frontend's useGoogleLogin hook (useIntegrationsPage.jsx:104) currently requests only the two contacts scopes: the metadata/calendar scopes are declared server-side but never requested from the user. Scope UI toggles must update both the frontend request string and the backend's acceptance logic.
3.2 OAuth flow + scope management
The existing flow, visualized (U = user, AP = admin portal, G = Google, BE = backend):
Authorization request scope: contacts.readonly + other.readonly [NEW: + gmail.metadata + calendar.readonly]
G-->>U: Consent screen (shows per-scope)
U->>G: Approve
G-->>AP: authorization code
AP->>BE: POST /user_google_accounts { code }
BE->>G: exchange_code_for_token(code)
G-->>BE: access_token + refresh_token + id_token
BE->>BE: GoogleAccount.upsert (scopes = returned granted_scopes)
BE-->>AP: { id, syncStatus: syncing }
AP->>BE: Poll GET /user_google_accounts every 3s
BE-->>AP: syncStatus transitions to synced
Scope strings (canonical list)
# GoogleAccount::GOOGLE_SCOPES (verified in code)
contacts = 'https://www.googleapis.com/auth/contacts.readonly'
other_contacts = 'https://www.googleapis.com/auth/contacts.other.readonly'
gmail_metadata = 'https://www.googleapis.com/auth/gmail.metadata'
calendar_readonly = 'https://www.googleapis.com/auth/calendar.readonly'
Admin portal change: scope toggles
Current code:
// useIntegrationsPage.jsx:99-105 (current)
const googleLogin = useGoogleLogin({
onSuccess, onError,
flow: 'auth-code',
redirect_uri: 'postmessage',
scope: 'https://www.googleapis.com/auth/contacts.readonly https://www.googleapis.com/auth/contacts.other.readonly',
});
After (scope string built from user's checkbox state):
const buildScopeString = ({ contacts, gmailMetadata, calendar }) => [
contacts && 'https://www.googleapis.com/auth/contacts.readonly',
contacts && 'https://www.googleapis.com/auth/contacts.other.readonly',
gmailMetadata && 'https://www.googleapis.com/auth/gmail.metadata',
calendar && 'https://www.googleapis.com/auth/calendar.readonly',
].filter(Boolean).join(' ');
const googleLogin = useGoogleLogin({
flow: 'auth-code',
redirect_uri: 'postmessage',
scope: buildScopeString(scopeChoices),
// ...
});
With include_granted_scopes: true, Google merges previously granted scopes with the new ones, so the user is not forced to re-consent to everything.
Backend accepts granted scopes
The existing GoogleAccount.granted_scopes pattern (line 44-46) is already scope-aware. Sync jobs check has_gmail_metadata_scope? (line 56-58) before running. No change needed server-side except enabling the workers to run based on those flags.
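To illustrate gating workers on granted scopes, here is a minimal plain-Ruby sketch that maps a granted-scope list to the syncs allowed to run. The `SCOPE_TO_SYNC` table and the `enabled_syncs` helper are hypothetical names for illustration, not existing Getro code; the real check would go through predicates such as has_gmail_metadata_scope?.

```ruby
# Hypothetical mapping from a granted OAuth scope to the sync it unlocks.
SCOPE_TO_SYNC = {
  'https://www.googleapis.com/auth/contacts.readonly' => :contacts,
  'https://www.googleapis.com/auth/gmail.metadata'    => :gmail_metadata,
  'https://www.googleapis.com/auth/calendar.readonly' => :calendar
}.freeze

# Returns the syncs allowed for the scopes the user actually granted.
def enabled_syncs(granted_scopes)
  SCOPE_TO_SYNC.select { |scope, _sync| granted_scopes.include?(scope) }.values
end

# Example: the user ticked contacts and calendar but not Gmail metadata.
granted = [
  'https://www.googleapis.com/auth/contacts.readonly',
  'https://www.googleapis.com/auth/contacts.other.readonly',
  'https://www.googleapis.com/auth/calendar.readonly'
]
syncs = enabled_syncs(granted)
```

A scheduler could iterate over the returned symbols and enqueue only the matching jobs, so an unchecked source never triggers an API call.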
Reference docs
- OAuth 2.0 for Web Server Applications
- OAuth 2.0 scopes for Google APIs
- Gmail API OAuth scopes: gmail.metadata restricted scope specifics
3.3 Gmail metadata client
Extends lib/google/ with a new client class. Uses google-api-client (already in Gemfile).
What the gmail.metadata scope returns
Per the users.messages.list docs:
- Allowed params: labelIds, maxResults (default 100, max 500), pageToken.
- Blocked params: q (search query) and includeSpamTrash, explicitly rejected when using metadata scope.
- List response: only id + threadId per message. No headers.
- For headers: follow-up messages.get(id, format: 'metadata') returns selected headers (From, To, Cc, Bcc, Date, Subject, Message-ID, In-Reply-To, References, List-Unsubscribe, Precedence).
Incremental sync via users.history.list
Per history.list docs:
- Each sync run stores the returned historyId.
- Next run passes startHistoryId; Gmail returns only records changed since.
- Record types: messageAdded, messageDeleted, labelAdded, labelRemoved. For metadata ingestion we care about messageAdded.
- Expiry: historyId is valid "at least a week" but may expire earlier. 404 means do a full sync.
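To make the messageAdded handling concrete, here is a minimal plain-Ruby sketch. It assumes history records arrive as hashes shaped like the history.list JSON response, where each record may carry a `messagesAdded` array whose entries wrap a `message` object; the helper name `added_message_ids` is hypothetical.

```ruby
# Hypothetical helper: collect the IDs of messages added since the last sync.
# Records without a "messagesAdded" array (label changes, deletions) are skipped.
def added_message_ids(history_records)
  history_records
    .flat_map { |record| record.fetch('messagesAdded', []) }
    .map { |entry| entry.dig('message', 'id') }
    .compact
    .uniq
end

# Sample records: two additions, one label change, and one duplicate reference.
records = [
  { 'id' => '101', 'messagesAdded' => [{ 'message' => { 'id' => 'a1', 'threadId' => 't1' } }] },
  { 'id' => '102', 'labelsAdded' => [{ 'message' => { 'id' => 'a1' }, 'labelIds' => ['STARRED'] }] },
  { 'id' => '103', 'messagesAdded' => [{ 'message' => { 'id' => 'a2', 'threadId' => 't1' } },
                                       { 'message' => { 'id' => 'a1', 'threadId' => 't1' } }] }
]
ids = added_message_ids(records)
```

De-duplicating before the follow-up messages.get calls avoids paying the per-message quota twice when the same message shows up in several history records.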
Client skeleton
# backend/lib/google/gmail_metadata_client.rb
require 'google/apis/gmail_v1'
require 'signet/oauth_2/client'
module Google
class GmailMetadataClient
METADATA_HEADERS = %w[From To Cc Bcc Subject Date Message-ID In-Reply-To References
List-Unsubscribe Precedence Delivered-To Return-Path].freeze
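# labelIds are ANDed by messages.list, so ['INBOX', 'SENT'] would match almost nothing.
# Callers pass one label per call (nil lists all mail).
def list_messages(account:, page_token: nil, label_ids: nil, page_size: 500)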
with_refresh(account: account) do
gmail = build_service(account: account)
result = gmail.list_user_messages(
'me',
label_ids: label_ids,
max_results: page_size,
page_token: page_token
)
{ messages: result.messages.to_a, next_page_token: result.next_page_token,
result_size_estimate: result.result_size_estimate }
end
end
def get_message_metadata(account:, message_id:)
with_refresh(account: account) do
gmail = build_service(account: account)
gmail.get_user_message(
'me', message_id,
format: 'metadata',
metadata_headers: METADATA_HEADERS
)
end
end
def list_history(account:, start_history_id:, page_token: nil, history_types: ['messageAdded'])
with_refresh(account: account) do
gmail = build_service(account: account)
result = gmail.list_user_histories(
'me',
start_history_id: start_history_id,
history_types: history_types,
page_token: page_token,
max_results: 500
)
{ history: result.history.to_a, next_page_token: result.next_page_token,
history_id: result.history_id }
end
rescue Google::Apis::ClientError => e
# 404 means the stored historyId has expired; the syncer falls back to a full sync.
raise HistoryIdExpiredError if e.status_code.to_i == 404
raise
end
class HistoryIdExpiredError < StandardError; end
private
def build_service(account:)
gmail = ::Google::Apis::GmailV1::GmailService.new
gmail.client_options.open_timeout_sec = 5
gmail.client_options.read_timeout_sec = 30
gmail.request_options.retries = 0
gmail.authorization = authorization_for(account: account)
gmail
end
def authorization_for(account:)
Signet::OAuth2::Client.new(
token_credential_uri: 'https://oauth2.googleapis.com/token',
client_id: ENV['GOOGLE_CLIENT_ID'],
client_secret: ENV['GOOGLE_CLIENT_SECRET'],
scope: ['https://www.googleapis.com/auth/gmail.metadata'],
access_token: account.oauth_access_token,
refresh_token: account.oauth_refresh_token
)
end
def with_refresh(account:)
Circuitbox.circuit(:google_gmail_metadata, circuit_options).run do
tried_refresh = false
begin
yield
rescue ::Google::Apis::AuthorizationError, ::Google::Apis::ClientError => e
if !tried_refresh && auth_error?(e) && refresh!(account: account)
tried_refresh = true
retry
end
raise
end
end
end
# auth_error?, refresh!, circuit_options: mirror people_client.rb exactly
end
end
Syncer skeleton (mirrors ContactsSyncer)
# backend/app/services/emails/import/google/gmail_metadata_syncer.rb
module Emails
module Import
module Google
class GmailMetadataSyncer < ApplicationService
LABELS = %w[INBOX SENT].freeze
def initialize(account:, force_full: false)
@account = account
@client = ::Google::GmailMetadataClient.new
@state = ::GmailMetadataSyncState.find_or_create_by!(google_account: account)
@force_full = force_full
end
def call
return failure(error: 'missing scope') unless account.has_gmail_metadata_scope?
begin
if force_full || state.history_id.blank?
full_sync!
else
incremental_sync!
end
rescue ::Google::GmailMetadataClient::HistoryIdExpiredError
# Stored historyId expired: drop it and rebuild from a full listing.
state.update!(history_id: nil)
full_sync!
end
state.update!(last_synced_at: Time.current, last_status: :success, last_error: nil)
success
rescue => e
state.update!(last_status: :error, last_error: e.message)
raise
end
private
attr_reader :account, :client, :state, :force_full
def full_sync!
latest_history_id = nil
# labelIds are ANDed by messages.list, so each label is listed separately.
LABELS.each do |label|
page_token = nil
loop do
resp = client.list_messages(account:, label_ids: [label], page_token: page_token)
page_max = ingest(resp[:messages])
latest_history_id = [latest_history_id, page_max].compact.max
page_token = resp[:next_page_token]
break if page_token.blank?
end
end
state.update!(history_id: latest_history_id) if latest_history_id
end
def incremental_sync!
start_history_id = state.history_id
latest_history_id = nil
page_token = nil
loop do
resp = client.list_history(account:, start_history_id: start_history_id, page_token: page_token)
ingest_history_records(resp[:history])
latest_history_id = resp[:history_id] || latest_history_id
page_token = resp[:next_page_token]
break if page_token.blank?
end
# Persist only after the last page so a mid-run failure resumes from the old historyId.
state.update!(history_id: latest_history_id) if latest_history_id
end
# messages.list returns only id + threadId, so historyId is read from messages.get.
# Returns the highest historyId seen on this page (nil for an empty page).
def ingest(message_refs)
message_refs.filter_map do |ref|
msg = client.get_message_metadata(account:, message_id: ref.id)
InteractionEvent.upsert(to_event_row(msg), unique_by: :remote_message_id)
msg.history_id
end.max
end
def to_event_row(msg)
# Parse headers, detect direction, flag newsletter, normalize timestamp.
# See Newsletter detection section.
end
# ingest_history_records: iterate messageAdded entries, fetch metadata, upsert
end
end
end
end
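The to_event_row stub above is left open in the skeleton. As one possible shape, here is a plain-Ruby sketch of the header parsing and direction detection it describes. Everything in it is an assumption for illustration: the row keys (`direction`, `counterparties`, `occurred_at`), the simple address regex, and headers passed as `{ name:, value: }` hashes rather than the API client's header objects.

```ruby
# Hypothetical sketch: turn metadata headers into an interaction row.
def header_map(headers)
  headers.to_h { |h| [h[:name].downcase, h[:value]] }
end

# Pulls bare addresses out of values like 'Ada <ada@example.com>, bob@example.org'.
def extract_addresses(value)
  value.to_s.scan(/[\w.+-]+@[\w-]+(?:\.[\w-]+)+/).map(&:downcase)
end

def to_event_row(headers:, internal_date_ms:, account_email:)
  h = header_map(headers)
  sender = extract_addresses(h['from']).first
  recipients = %w[to cc bcc].flat_map { |key| extract_addresses(h[key]) }.uniq
  outbound = sender == account_email.downcase
  {
    remote_message_id: h['message-id'],
    direction: outbound ? :outbound : :inbound,
    # Outbound mail credits every recipient; inbound mail credits the sender.
    counterparties: outbound ? recipients : [sender].compact,
    occurred_at: Time.at(internal_date_ms / 1000.0).utc
  }
end

row = to_event_row(
  headers: [
    { name: 'From', value: 'Me <me@getro.com>' },
    { name: 'To', value: 'Ada <ada@example.com>, bob@example.org' },
    { name: 'Message-ID', value: '<abc@mail.example.com>' }
  ],
  internal_date_ms: 1_700_000_000_000,
  account_email: 'me@getro.com'
)
```

Comparing the From address with the connected account's address is the simplest direction test; accounts with send-as aliases would need a list of addresses instead of a single one.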
Newsletter detection (runs at ingestion)
def newsletter?(msg)
headers = msg.payload.headers.to_h { |h| [h.name.downcase, h.value] }
return true if headers['list-unsubscribe'].present?
return true if headers['precedence']&.match?(/bulk|list|junk/i)
recipient_count = %w[to cc bcc].sum { |h| (headers[h] || '').split(',').size }
return true if recipient_count > 20
false
end
Rate-limit posture
Gmail API quota: 1 billion units/day/project, 250 units/user/second. messages.list costs 5 units and messages.get costs 5 units. A 24-month backfill of 20,000 messages needs 20,000 messages.get calls plus 40 messages.list pages of 500, so roughly 100k units (≈400 seconds at the per-user limit if serialized, so parallelize with care). Rate limiting falls out of the CircuitBox quota + Sidekiq concurrency cap.
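The backfill estimate can be reproduced from the per-call costs quoted above. This is a rough sketch under stated assumptions: one messages.get per message, list pages of 500, and the per-user rate as the only bottleneck. The constant and method names are hypothetical.

```ruby
# Quota costs and limits as quoted in this section.
LIST_COST_UNITS = 5
GET_COST_UNITS = 5
LIST_PAGE_SIZE = 500
PER_USER_UNITS_PER_SECOND = 250

# Estimates quota units and the minimum serialized duration of a backfill.
def backfill_estimate(message_count)
  list_calls = (message_count.to_f / LIST_PAGE_SIZE).ceil
  units = list_calls * LIST_COST_UNITS + message_count * GET_COST_UNITS
  {
    list_calls: list_calls,
    units: units,
    min_seconds: (units.to_f / PER_USER_UNITS_PER_SECOND).ceil
  }
end

estimate = backfill_estimate(20_000)
```

The messages.get calls dominate the total, so the page size of the listing has almost no effect on the estimate.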
3.4 Google Calendar metadata client
Simpler than Gmail: one endpoint, one sync token, one response shape.
What calendar.readonly returns
Per events.list docs:
- Incremental: syncToken (opaque); 410 on expiry triggers a full sync.
- Expansion: singleEvents=true expands recurring events into instances. Pair with orderBy: 'startTime'.
- Attendees: array with email, displayName, responseStatus (needsAction, declined, tentative, accepted).
- Pagination: maxResults default 250, max 2500. nextPageToken.
Client skeleton
# backend/lib/google/calendar_client.rb
require 'google/apis/calendar_v3'
module Google
class CalendarClient
def list_events(account:, calendar_id: 'primary', sync_token: nil, page_token: nil,
time_min: nil, max_results: 2500)
with_refresh(account: account) do
service = build_service(account: account)
service.list_events(
calendar_id,
sync_token: sync_token,
page_token: page_token,
max_results: max_results,
single_events: true,
order_by: sync_token.blank? ? 'startTime' : nil, # orderBy is rejected alongside sync_token
time_min: sync_token.blank? ? time_min : nil, # timeMin is rejected alongside sync_token; first full sync only
show_deleted: true
)
end
end
# build_service, with_refresh identical pattern to gmail client
end
end
Syncer loop (410 Gone recovery)
Mirrors ContactsSyncer's existing expired_sync_token_error? pattern (contacts_syncer.rb:159):
def fetch_page(page_token:, sync_token:)
client.list_events(account:, page_token:, sync_token:)
rescue ::Google::Apis::ClientError => e
raise unless e.status_code.to_i == 410
state.update!(sync_token: nil) # trigger full resync on next call
client.list_events(account:, time_min: 24.months.ago.iso8601)
end
Backfill horizon
First full sync passes time_min: 24.months.ago.iso8601, per DR-08 in the technical spec. Nightly sync uses the stored syncToken.
4. Microsoft 365
Greenfield · 2 new clients + OAuth + admin-portal UI
4.1 Azure AD multi-tenant app registration
This is the prerequisite that has to happen first and is partly ops work. The Azure app is where Getro registers as a third-party consumer of Microsoft 365 data. It has to be multi-tenant so a user at any Microsoft 365 tenant can consent.
Step-by-step
- Go to Microsoft Entra admin center, App registrations > New registration.
- Name: Getro Production (and Getro Sandbox for non-prod).
- Supported account types: "Accounts in any organizational directory (any Microsoft Entra ID tenant - Multitenant)", i.e. signInAudience = AzureADMultipleOrgs.
- Redirect URI (Web): https://api.getro.com/auth/microsoft/callback (plus staging + dev variants).
- Under Certificates & secrets > add a client secret (store in Getro's env vars).
- Under API permissions > Add a permission > Microsoft Graph > Delegated permissions:
  - offline_access
  - User.Read
  - Mail.ReadBasic
  - Calendars.ReadBasic
  - Contacts.Read (optional: only if we sync MS contacts too)
  - Mail.Read.Shared (optional: shared mailbox support)
  - Calendars.Read.Shared (optional: shared calendar support)
- Under Authentication > enable Access tokens + ID tokens.
- Expose tenant ID = common (or organizations for enterprise-only).
common vs organizations. common accepts personal Microsoft accounts (Xbox, Hotmail) plus work/school. organizations is work/school only. Given Inovia-style targets are enterprise, organizations is the safer default: it prevents a user from connecting a personal @outlook.com mailbox and polluting signals.
Admin consent
For Mail.ReadBasic, Calendars.ReadBasic, Mail.Read.Shared: individual user consent suffices. No admin consent required. (Verified in MS Graph permissions reference.)
Some enterprise tenants disable user-level consent globally via IT policy. In those cases the user sees "Need admin approval" and we show a "Request approval from your IT admin" CTA in the admin portal. Microsoft provides a standard admin-consent URL pattern: https://login.microsoftonline.com/{tenant}/adminconsent?client_id={app}&redirect_uri={cb}.
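The admin-consent link behind that CTA can be assembled from the URL pattern above. A minimal sketch follows; the helper name `admin_consent_url` and the placeholder client ID are hypothetical, and the optional `state` parameter is included on the assumption that the callback wants to correlate the response with the requesting user.

```ruby
require 'uri'

# Hypothetical helper: builds the admin-consent link shown to a blocked user.
def admin_consent_url(tenant:, client_id:, redirect_uri:, state: nil)
  params = { client_id: client_id, redirect_uri: redirect_uri }
  params[:state] = state if state
  # encode_www_form percent-encodes the redirect URI for use in a query string.
  "https://login.microsoftonline.com/#{tenant}/adminconsent?#{URI.encode_www_form(params)}"
end

url = admin_consent_url(
  tenant: 'organizations',
  client_id: 'example-client-id', # placeholder, not a real app ID
  redirect_uri: 'https://api.getro.com/auth/microsoft/callback'
)
```

When the user's tenant ID is already known from a previous sign-in attempt, passing it instead of organizations sends the admin straight to the right directory.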
4.2 MicrosoftAccount model
Mirror of GoogleAccount. Same encryption pattern, same scopes-as-JSONB shape.
# db/migrate/YYYYMMDDHHMMSS_create_microsoft_accounts.rb
class CreateMicrosoftAccounts < ActiveRecord::Migration[7.2]
def change
create_table :microsoft_accounts do |t|
t.references :user, null: false, foreign_key: true
t.string :email, null: false
t.string :microsoft_uid, null: false # Graph's id for the user
t.string :tenant_id # tenant of the authenticating user
t.text :oauth_access_token
t.text :oauth_refresh_token
t.datetime :revoked_at
t.jsonb :scopes
t.integer :status, default: 0, null: false
t.timestamps
end
add_index :microsoft_accounts, [:user_id, :microsoft_uid], unique: true
end
end
# app/models/microsoft_account.rb
class MicrosoftAccount < ApplicationRecord
belongs_to :user
has_many :ms_mail_sync_states, dependent: :destroy # one row per synced folder (Inbox, SentItems)
has_one :ms_calendar_sync_state, dependent: :destroy
encrypts :oauth_access_token
encrypts :oauth_refresh_token
MS_SCOPES = {
mail_read_basic: 'Mail.ReadBasic',
calendars_read_basic: 'Calendars.ReadBasic',
mail_read_shared: 'Mail.Read.Shared',
calendars_read_shared:'Calendars.Read.Shared',
offline_access: 'offline_access',
user_read: 'User.Read'
}.freeze
def has_refresh_token? = oauth_refresh_token.present?
def has_mail_scope? = granted_scopes.include?('Mail.ReadBasic')
def has_calendar_scope? = granted_scopes.include?('Calendars.ReadBasic')
def has_shared_mail_scope? = granted_scopes.include?('Mail.Read.Shared')
def has_shared_calendar_scope? = granted_scopes.include?('Calendars.Read.Shared')
def granted_scopes = Array(scopes)
end
4.3 OAuth flow + gem choice
No Microsoft gems are in Gemfile today. Three gem options ranked by fit:
| Option | Why pick | Why not |
|---|---|---|
| Faraday + manual OAuth (recommended) | Consistent with lib/google/oauth_client.rb pattern. Zero new gems. Full control. | We write token exchange + refresh ourselves (~80 LOC). |
| omniauth-azure-activedirectory-v2 | Composable with existing OmniAuth infrastructure (omniauth is in Gemfile). | Adds a gem for roughly what Faraday does inline. Overkill for one integration. |
| msal-ruby (community port) | Mirrors Microsoft's official MSAL libs in other langs. | Low-maintenance status on GitHub; may lag Graph updates. Adds runtime deps. |
Recommend Faraday manual: it matches how lib/google/oauth_client.rb was built.
Client skeleton
# backend/lib/microsoft/oauth_client.rb
module Microsoft
class OauthClient
AUTHORITY = 'https://login.microsoftonline.com'
TENANT = 'organizations' # see DR in technical spec
def exchange_code_for_token(code:, redirect_uri:)
conn = Faraday.new(url: "#{AUTHORITY}/#{TENANT}") { |f| f.request :url_encoded; f.adapter Faraday.default_adapter }
# Relative path on purpose: a leading slash would drop the /#{TENANT} prefix from the base URL.
resp = conn.post('oauth2/v2.0/token',
code: code,
client_id: ENV['MICROSOFT_CLIENT_ID'],
client_secret: ENV['MICROSOFT_CLIENT_SECRET'],
redirect_uri: redirect_uri,
grant_type: 'authorization_code',
scope: 'offline_access openid email User.Read Mail.ReadBasic Calendars.ReadBasic'
)
resp.status == 200 ? JSON.parse(resp.body) : nil
end
def refresh_access_token(refresh_token:)
conn = Faraday.new(url: "#{AUTHORITY}/#{TENANT}") { |f| f.request :url_encoded; f.adapter Faraday.default_adapter }
# Relative path on purpose: a leading slash would drop the /#{TENANT} prefix from the base URL.
resp = conn.post('oauth2/v2.0/token',
refresh_token: refresh_token,
client_id: ENV['MICROSOFT_CLIENT_ID'],
client_secret: ENV['MICROSOFT_CLIENT_SECRET'],
grant_type: 'refresh_token',
scope: 'offline_access openid email User.Read Mail.ReadBasic Calendars.ReadBasic'
)
resp.status == 200 ? JSON.parse(resp.body) : nil
end
end
end
Admin-portal change
Mirror userGoogleAccountsV2.js:
// admin-portal/src/services/userMicrosoftAccountsV2.js
export const userMicrosoftAccountsApi = createApi({
reducerPath: 'userMicrosoftAccountsApi',
baseQuery,
tagTypes: ['microsoft-accounts'],
endpoints: (builder) => ({
getUserMicrosoftAccounts: builder.query({ /* GET */ }),
createUserMicrosoftAccount: builder.mutation({ /* POST { code, redirect_uri } */ }),
deleteUserMicrosoftAccount: builder.mutation({ /* DELETE */ }),
}),
});
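There is no @react-oauth/microsoft equivalent as clean as Google's. Microsoft's MSAL.js (@azure/msal-browser) redeems the authorization code in the browser and hands back tokens rather than a code for the backend to exchange, so a hand-rolled popup against the v2.0 authorize endpoint is the cleanest fit for the POST { code, redirect_uri } contract above:
const AUTHORIZE_URL = 'https://login.microsoftonline.com/organizations/oauth2/v2.0/authorize';
const redirectUri = window.location.origin + '/auth/microsoft/callback';
const handleMicrosoftConnect = (scopeChoices) => {
const scopes = ['offline_access', 'User.Read']
.concat(scopeChoices.mail ? ['Mail.ReadBasic'] : [])
.concat(scopeChoices.calendar ? ['Calendars.ReadBasic'] : [])
.concat(scopeChoices.sharedMail ? ['Mail.Read.Shared'] : [])
.concat(scopeChoices.sharedCalendar ? ['Calendars.Read.Shared'] : []);
// Random state, stored so the callback page can verify it (CSRF protection).
const state = crypto.randomUUID();
sessionStorage.setItem('ms_oauth_state', state);
const params = new URLSearchParams({
client_id: import.meta.env.VITE_MS_CLIENT_ID,
response_type: 'code',
response_mode: 'query',
redirect_uri: redirectUri,
scope: scopes.join(' '),
state,
});
window.open(`${AUTHORIZE_URL}?${params}`, 'ms-oauth', 'width=500,height=650');
};
// The callback page verifies state and passes the code to the opener, which then calls:
// await createUserMicrosoftAccount({ code, redirect_uri: redirectUri }).unwrap();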
4.4 Mail.ReadBasic client
Per user-list-messages and delta query messages:
- Endpoint: GET /me/messages for full listing. GET /me/mailFolders/{id}/messages/delta for incremental (per-folder, typically Inbox + Sent Items).
- Projection: $select=id,conversationId,from,toRecipients,ccRecipients,bccRecipients,sentDateTime,receivedDateTime,subject,internetMessageId,isRead,internetMessageHeaders
- Pagination: Follow @odata.nextLink verbatim; do not parse $skip.
- Paging size: Prefer: odata.maxpagesize=100 (max 1000).
- Delta lifecycle: GET returns @odata.nextLink (continue) or @odata.deltaLink (round complete, save for next sync). deltaLink expires after ~30 days; expired token returns 410.
- Deleted messages: in delta response as { id, "@removed": { "reason": "deleted" } }.
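The delta lifecycle above boils down to three things per page: which items to upsert, which to remove, and which link to follow or store. A minimal plain-Ruby sketch, assuming the page is the parsed JSON hash of a delta response; the helper name `partition_delta_page` and the returned keys are hypothetical.

```ruby
# Hypothetical helper: split one delta page into upserts, removals, and links.
def partition_delta_page(page)
  removed, changed = page.fetch('value', []).partition { |item| item.key?('@removed') }
  {
    upserts: changed,
    removed_ids: removed.map { |item| item['id'] },
    next_link: page['@odata.nextLink'],   # present while the round continues
    delta_link: page['@odata.deltaLink']  # present on the final page only
  }
end

page = {
  'value' => [
    { 'id' => 'm1', 'subject' => 'Intro call' },
    { 'id' => 'm2', '@removed' => { 'reason' => 'deleted' } }
  ],
  '@odata.deltaLink' => 'https://graph.microsoft.com/v1.0/me/mailFolders/Inbox/messages/delta?$deltatoken=abc'
}
result = partition_delta_page(page)
```

Keeping the removals separate lets the syncer decide whether a deleted message should also delete its interaction row or just be ignored.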
Client skeleton
# backend/lib/microsoft/mail_client.rb
module Microsoft
class MailClient
GRAPH = 'https://graph.microsoft.com/v1.0'
SELECT = %w[id conversationId from toRecipients ccRecipients bccRecipients
sentDateTime receivedDateTime subject internetMessageId isRead
internetMessageHeaders].join(',')
def list_messages_delta(account:, folder_id: 'Inbox', delta_link: nil, page_url: nil)
with_refresh(account: account) do
url = page_url || delta_link || "#{GRAPH}/me/mailFolders/#{folder_id}/messages/delta?$select=#{SELECT}"
resp = faraday(account: account).get(url) do |req|
req.headers['Prefer'] = 'odata.maxpagesize=100'
end
raise_for_status(resp)
json = JSON.parse(resp.body)
{
messages: json['value'],
next_link: json['@odata.nextLink'],
delta_link: json['@odata.deltaLink']
}
end
end
private
def faraday(account:)
Faraday.new do |f|
f.headers['Authorization'] = "Bearer #{account.oauth_access_token}"
f.headers['Accept'] = 'application/json'
f.adapter Faraday.default_adapter
end
end
def raise_for_status(resp)
case resp.status
when 200..299 then return
when 401 then raise Microsoft::AuthError
when 410 then raise Microsoft::DeltaTokenExpired
when 429 then raise Microsoft::Throttled.new(retry_after: resp.headers['Retry-After'].to_i)
else raise Microsoft::ApiError.new("#{resp.status}: #{resp.body}")
end
end
# with_refresh: same pattern as Google (refresh on 401, wrap in CircuitBox)
end
end
Throttling
Per MS Graph throttling docs: 429 responses include Retry-After (seconds). Respect it. Our Microsoft::Throttled exception carries the value, so the job can schedule a delayed retry via Sidekiq's built-in sidekiq_retry_in.
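A sketch of how the Retry-After value could travel from the client to the retry schedule. The error class body, the `MIN_THROTTLE_DELAY` floor, and the `retry_delay_for` helper are assumptions for illustration; the method takes the same (count, exception) arguments that a sidekiq_retry_in block receives, so the worker could delegate to it.

```ruby
module Microsoft
  # Assumed shape of the error raised by raise_for_status on a 429.
  class Throttled < StandardError
    attr_reader :retry_after

    def initialize(retry_after:)
      @retry_after = retry_after.to_i
      super("Throttled by Microsoft Graph; retry after #{@retry_after}s")
    end
  end
end

# Floor for the delay when the header is missing or zero (assumed value).
MIN_THROTTLE_DELAY = 5

# Returns the delay in seconds for throttled jobs, or nil for any other error
# so that the default retry backoff applies.
def retry_delay_for(count, exception)
  return nil unless exception.is_a?(Microsoft::Throttled)

  # Honor the server's hint, then add the retry count as a small stagger.
  [exception.retry_after, MIN_THROTTLE_DELAY].max + count
end

delay = retry_delay_for(2, Microsoft::Throttled.new(retry_after: '30'))
```

Returning nil for unrelated errors keeps throttling the only case with custom timing, which fits the "inline is simpler" option listed under the decisions to confirm.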
Syncer skeleton (delta loop)
module Emails
module Import
module Microsoft
class MailSyncer < ApplicationService
FOLDERS = %w[Inbox SentItems].freeze
def initialize(account:)
@account = account
@client = ::Microsoft::MailClient.new
end
def call
FOLDERS.each { |folder| sync_folder(folder) }
success
end
private
attr_reader :account
def sync_folder(folder_id)
state = MsMailSyncState.find_or_create_by!(microsoft_account: account, folder_id: folder_id)
page_url = nil
delta_link = state.delta_link
loop do
resp = @client.list_messages_delta(account:, folder_id: folder_id, delta_link: delta_link, page_url: page_url)
ingest(resp[:messages])
if resp[:next_link]
page_url = resp[:next_link]
delta_link = nil
elsif resp[:delta_link]
state.update!(delta_link: resp[:delta_link])
break
else
break
end
end
rescue ::Microsoft::DeltaTokenExpired
state.update!(delta_link: nil)
retry
# ::Microsoft::Throttled is deliberately not rescued here: it propagates to the worker,
# whose sidekiq_retry_in block reads retry_after to schedule the delayed retry.
end
end
end
end
end
4.5 Calendars.ReadBasic client
Per user-list-events:
- Endpoint: GET /me/events for listing. Delta: GET /me/calendarView/delta (note: events.delta at the calendar level, not /me/events/delta).
- Fields: id, subject, organizer, attendees[], start, end, isAllDay, isCancelled.
- Attendee responseStatus: none, organizer, tentativelyAccepted, accepted, declined, notResponded.
- Timezone: send Prefer: outlook.timezone="UTC" on every request to avoid per-user timezone drift.
- Projection: $select=id,subject,organizer,attendees,start,end,isCancelled. Note that Calendars.ReadBasic already excludes body/attachments, so $select is an optimization, not a privacy control.
Attendee filtering
For scoring, we count attendees where responseStatus.response != 'declined'. We skip cancelled events entirely (isCancelled: true).
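The filtering rule can be sketched as a small pure function. It assumes events are parsed Graph JSON hashes in which each attendee carries `status.response` and `emailAddress.address`; the helper name `scoring_attendees` is hypothetical.

```ruby
# Hypothetical helper: attendee addresses that count toward scoring.
# Cancelled events contribute nothing; declined attendees are dropped.
def scoring_attendees(event)
  return [] if event['isCancelled']

  event.fetch('attendees', [])
       .reject { |attendee| attendee.dig('status', 'response') == 'declined' }
       .map { |attendee| attendee.dig('emailAddress', 'address') }
       .compact
       .map(&:downcase)
end

event = {
  'isCancelled' => false,
  'attendees' => [
    { 'emailAddress' => { 'address' => 'Ada@Example.com' }, 'status' => { 'response' => 'accepted' } },
    { 'emailAddress' => { 'address' => 'bob@example.org' }, 'status' => { 'response' => 'declined' } },
    { 'emailAddress' => { 'address' => 'cy@example.net' }, 'status' => { 'response' => 'notResponded' } }
  ]
}
attendees = scoring_attendees(event)
```

Attendees who have not responded still count under this rule, which matches the "anything but declined" wording above.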
Client skeleton
# backend/lib/microsoft/calendar_client.rb
module Microsoft
class CalendarClient
GRAPH = 'https://graph.microsoft.com/v1.0'
SELECT = %w[id subject organizer attendees start end isAllDay isCancelled].join(',')
def list_events_delta(account:, delta_link: nil, page_url: nil, start_date_time: nil, end_date_time: nil)
url = page_url || delta_link || build_initial_url(start_date_time, end_date_time)
with_refresh(account: account) do
resp = faraday(account: account).get(url) do |req|
req.headers['Prefer'] = 'outlook.timezone="UTC", odata.maxpagesize=100'
end
raise_for_status(resp)
parse_response(resp)
end
end
private
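def build_initial_url(start_dt, end_dt)
start_dt ||= 24.months.ago.iso8601
end_dt ||= 1.year.from_now.iso8601
# Per the Graph event delta docs, calendarView/delta does not accept $select,
# so SELECT is left off this URL and applies only to plain /me/events listing.
"#{GRAPH}/me/calendarView/delta?startDateTime=#{CGI.escape(start_dt)}&endDateTime=#{CGI.escape(end_dt)}"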
end
end
end
6. Cross-provider concerns
Self-emails / team-member-as-contact
Emailing a colleague shouldn't make them look like a Contact. Before inserting InteractionEvent:
return if User.where(email: contact_email, collection_id: account.collections.select(:id)).exists?
Same check for both providers. The resolver never attempts to match internal team emails to Contacts.
BCC semantics
Each provider exposes BCC only on the sender's own messages:
- Gmail: BCC header appears on messages where the user is the From, never on received messages.
- MS Graph: bccRecipients is returned only when the signed-in user is the sender.
When bccRecipients is present on an outbound, attribute interaction to each BCC recipient (same as To + Cc).
Timezone handling
- Gmail: internalDate is always Unix epoch ms UTC. No ambiguity.
- Google Calendar: start.dateTime + start.timeZone. Always normalize to UTC before storing.
- MS Graph Mail: sentDateTime / receivedDateTime are ISO-8601 UTC. Store as-is.
- MS Graph Calendar: start is a dateTimeTimeZone object. Send Prefer: outlook.timezone="UTC" on every request to force UTC responses.
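The two Google cases can be normalized with the Ruby standard library alone. This is a sketch under assumptions: the helper names are hypothetical, timed events carry an RFC 3339 offset in start.dateTime, and all-day events (which carry start.date instead) are pinned to midnight UTC as a simplification.

```ruby
require 'time'

# Gmail internalDate: epoch milliseconds (delivered as a string or integer).
def gmail_internal_date_to_utc(internal_date)
  Time.at(internal_date.to_i / 1000.0).utc
end

# Google Calendar start/end object: { 'dateTime' => ... } or { 'date' => ... }.
def google_calendar_time_to_utc(start)
  if start['dateTime']
    # The offset embedded in the timestamp makes the conversion unambiguous.
    Time.iso8601(start['dateTime']).utc
  elsif start['date']
    # All-day event: assumed to map to midnight UTC of that date.
    Time.utc(*start['date'].split('-').map(&:to_i))
  end
end

sent_at = gmail_internal_date_to_utc('1700000000000')
starts_at = google_calendar_time_to_utc('dateTime' => '2024-03-10T09:00:00-05:00',
                                         'timeZone' => 'America/New_York')
```

Storing every timestamp in UTC at ingestion means the scoring queries never need to know which provider or calendar timezone an event came from.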
Newsletter / bulk inbound filter
Header inspection works on both providers; the header names are RFC standards, not provider-specific:
- Gmail: call messages.get(format: 'metadata') with metadata_headers: ['List-Unsubscribe', 'Precedence'].
- MS Graph: internetMessageHeaders field in the $select projection.
Apply identical logic (see Gmail section for detection predicate).
Attribution coverage
Independent of provider, coverage depends on Findem's ability to resolve the correspondent email to a canonical identity. Both providers produce the same email addresses; the dedup pipeline treats them identically.
7. Operational envelope (per-provider)
Google: Gmail
- Scope: gmail.metadata (approved)
- Daily quota: 1B units / project
- Per-user: 250 units/sec
- Incremental: history.list, ≥1 week valid
- List cost: 5 units per messages.list, 5 per messages.get
Google: Calendar
- Scope: calendar.readonly (declared)
- Incremental: syncToken, 410 on expiry
- Page size: default 250, max 2500
- Backfill horizon: 24 months via timeMin
Microsoft: Mail
- Scope: Mail.ReadBasic
- Admin consent: Not required
- Incremental: Per-folder delta, @odata.deltaLink
- Page size: default 10, max 1000; use Prefer: odata.maxpagesize=100
- Throttling: 429 + Retry-After
Microsoft: Calendar
- Scope: Calendars.ReadBasic
- Admin consent: Not required
- Incremental: /me/calendarView/delta
- Timezone: Prefer: outlook.timezone="UTC"
Microsoft: shared mailbox
- Scope: Mail.Read.Shared
- Admin consent: Not required
- Endpoint: /users/{upn}/messages
- Prerequisite: User has FullAccess in Exchange
Google: shared via DWD
- Flow: Service account + domain-wide delegation
- Admin consent: Required (per-domain)
- V1 status: Deferred (see matrix)
8. Decisions to confirm before execution
- MS tenant audience: organizations (work/school only) vs. common (also personal MS accounts). Recommend organizations: enterprise-only prevents personal-account pollution.
- Shared mailbox scope in V1: ship Google Groups + M365 shared (both low-cost), defer Gmail delegation. Confirm.
- Backfill horizon: 24 months on first connect (per DR-08). Confirm for both providers.
- MS OAuth gem choice: Faraday manual (recommended) vs. omniauth-azure-activedirectory-v2. Affects ~80 LOC.
- Gmail incremental strategy: history.list from day 1 (recommended; matches MS delta pattern) vs. always full list + diff. History API is more efficient.
- Personal MS account blocking: if we go with organizations, we need UI copy for users who try to connect user@outlook.com and hit an error.
- MS rate-limit handling in Sidekiq: inline retry via sidekiq_retry_in vs. dedicated throttle queue. Inline is simpler.
Next step
Once these decisions are made, the phased plan in the technical spec is ready to convert into JIRA tickets. No external blockers remain except the Azure tenant verification.