How the cursor works
- Each response returns a body.pagination.nextCursor. It is an opaque, base64url-encoded token. Do not parse, decode, or construct cursors client-side; pass them back verbatim as the cursor query parameter on the next request.
- Each cursor encodes the boundary row’s (cursor_field_value, _id). The cursor field varies by dataset; see the table on the Overview page. For most datasets it is updated_at; for D21 and D23 it is communicated_at.
- Cursors are stable: passing the same cursor twice returns the same page.
- When nextCursor is null, you have reached the end of the data for your current filter.
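
As a minimal sketch of the "pass it back verbatim" rule, the helper below builds the query string for the next request. The function name next_request_query is hypothetical, and two things are assumptions rather than documented behavior: that the parsed response body is a dict with a "pagination" key, and that the original filter parameters are sent again alongside cursor.

```python
from typing import Any, Dict, Optional
from urllib.parse import urlencode


def next_request_query(base_params: Dict[str, Any],
                       response_body: Dict[str, Any]) -> Optional[str]:
    """Return the query string for the next page, or None at end of data.

    Hypothetical helper: assumes the parsed body exposes
    pagination.nextCursor as body["pagination"]["nextCursor"].
    """
    cursor = response_body["pagination"]["nextCursor"]
    if cursor is None:
        # null nextCursor means the end of the data for this filter.
        return None
    params = dict(base_params)
    # The token is treated as an opaque string: never decoded or edited,
    # only URL-encoded as a query parameter value.
    params["cursor"] = cursor
    return urlencode(params)
```

The cursor is only ever copied from one response into the next request, so a change to the token's internal layout would not affect a client written this way.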
Load patterns
Initial load pattern
Omit since. Walk forward by passing back nextCursor until it returns null.
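
The initial load loop could look roughly like the sketch below. The fetch_page callable is a hypothetical stand-in for whatever HTTP client you use: it takes a cursor (None for the first request) and returns the parsed response body. The "rows" key is an assumption about the body layout.

```python
from typing import Any, Callable, Dict, Iterator, List, Optional

Page = Dict[str, Any]


def walk_all_pages(fetch_page: Callable[[Optional[str]], Page]) -> Iterator[List[Any]]:
    """Yield the rows of each page until nextCursor comes back as null.

    fetch_page is a hypothetical callable supplied by the caller; it
    should issue the request with the given cursor (or without one
    when cursor is None) and return the parsed body.
    """
    cursor: Optional[str] = None
    while True:
        body = fetch_page(cursor)
        # A page may be short or even empty; that alone is not the end.
        yield body["rows"]
        cursor = body["pagination"]["nextCursor"]
        if cursor is None:
            return
```

The loop terminates only on a null nextCursor, never on the size of a page.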
Incremental load pattern
Store max(updated_at) (or whichever cursor field that dataset uses) from the last full load. On the next run, pass that timestamp as since. Walk forward as above.
since is treated as an inclusive lower bound (>=) and until as an exclusive upper bound (<) on the cursor field. If your client is unsure how a range was interpreted, treat the body.meta.since and body.meta.until values echoed back in the response as authoritative.
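
Because since is inclusive, rows whose cursor field equals the stored timestamp will be returned again on the next run. One way to handle that is to upsert by _id, as in the sketch below. Both helper names are hypothetical, and the code assumes timestamps arrive as uniformly formatted ISO 8601 UTC strings, so that string comparison matches chronological order.

```python
from typing import Any, Dict, Iterable, Optional

Row = Dict[str, Any]


def merge_incremental(store: Dict[str, Row], rows: Iterable[Row]) -> None:
    """Upsert rows into an in-memory store keyed by _id.

    Rows re-delivered at the inclusive since boundary overwrite
    themselves instead of being duplicated.
    """
    for row in rows:
        store[row["_id"]] = row


def high_water_mark(rows: Iterable[Row],
                    cursor_field: str = "updated_at",
                    previous: Optional[str] = None) -> Optional[str]:
    """Return the largest cursor-field value seen so far.

    Assumption: values are same-format ISO 8601 UTC strings, so max()
    on the strings is the latest timestamp. Pass the stored value as
    previous so an empty run keeps the old mark.
    """
    values = [row[cursor_field] for row in rows]
    if previous is not None:
        values.append(previous)
    return max(values) if values else None
```

For D21 and D23, pass cursor_field="communicated_at", since those datasets use a different cursor field.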
Variable page size on some datasets
Datasets in S4 Communications (and some in S3) are filtered to your tenant through a parent collection using an internal $lookup join. For those, a single response can contain fewer rows than your limit even when more data exists beyond the page. The cursor still advances correctly; keep walking until nextCursor is null.
Do not infer “end of data” from rows.length < limit. Only nextCursor: null reliably means the end.
Affected datasets (TIER 2, variable page size): D03, D17, D18, D19, D21, D22, D23. You can tell which datasets behave this way from body.meta.tenantPathKind: a value of "via" indicates the join-based filter, while "direct" datasets return a constant page size until the last page.
Window the largest via datasets. A few of these, notably D23 (tens of millions of rows) and D03, are large enough that an unbounded request (no since/until) can exceed the API gateway timeout and return a 504. For these, always pass a time window and walk forward in slices (e.g. month-by-month), paging the cursor to null within each slice before advancing to the next. A smaller limit also helps each page return faster.
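
Month-by-month slicing can be sketched as a generator of (since, until) pairs. Since since is inclusive and until is exclusive, consecutive half-open windows neither overlap nor leave gaps. The function name month_windows is hypothetical, and how you serialize the datetimes into query parameters is left to your client.

```python
from datetime import datetime, timezone
from typing import Iterator, Tuple


def month_windows(start: datetime, end: datetime) -> Iterator[Tuple[datetime, datetime]]:
    """Yield (since, until) pairs covering [start, end) one calendar month at a time.

    The first and last windows may be partial months. start and end
    should both be timezone-aware or both naive.
    """
    since = start
    while since < end:
        # First instant of the month after `since`.
        if since.month == 12:
            year, month = since.year + 1, 1
        else:
            year, month = since.year, since.month + 1
        boundary = datetime(year, month, 1, tzinfo=since.tzinfo)
        until = min(boundary, end)
        yield since, until
        # The exclusive until of one slice is the inclusive since of the next.
        since = until


if __name__ == "__main__":
    utc = timezone.utc
    for since, until in month_windows(datetime(2024, 11, 15, tzinfo=utc),
                                      datetime(2025, 2, 1, tzinfo=utc)):
        print(since.isoformat(), until.isoformat())
```

Within each window, page the cursor to null before moving to the next pair.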
Sort order and tiebreaking
Rows are sorted by the dataset’s cursor field ascending by default. Pass sort=desc to reverse. Ties on the cursor field are broken by _id ascending (or descending, matching sort). The cursor encodes both the cursor-field value and _id, so pagination stays stable even when many rows share the same timestamp.
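
The ordering rule can be illustrated locally as a sort on the pair (cursor field, _id). This is only a client-side sketch for checking received data, not the server's implementation. The function names are hypothetical, and it assumes cursor-field values and _id values are strings whose lexicographic order matches the server's order.

```python
from typing import Any, Dict, List

Row = Dict[str, Any]


def expected_order(rows: List[Row],
                   cursor_field: str = "updated_at",
                   descending: bool = False) -> List[Row]:
    """Sort rows by (cursor field, _id); both keys flip together for desc."""
    return sorted(rows,
                  key=lambda row: (row[cursor_field], row["_id"]),
                  reverse=descending)


def in_expected_order(rows: List[Row],
                      cursor_field: str = "updated_at",
                      descending: bool = False) -> bool:
    """Return True if rows already follow the documented ordering."""
    return rows == expected_order(rows, cursor_field, descending)
```

A check like in_expected_order can be run over concatenated pages as a sanity test during development.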
Resuming after an interrupted load
Persist the most recent nextCursor after each page has been fully processed. On restart, resume by passing it back as cursor. There is no per-cursor TTL: a cursor stored for a week and resumed later will still work.
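
A minimal sketch of checkpointing, assuming a local JSON file as the store. The helper names, the file layout, and the fetch_page and process_rows callables are all hypothetical. The checkpoint is written only after a page has been processed, so a crash mid-page causes that page to be fetched and processed again on restart; processing should therefore be idempotent.

```python
import json
import os
from pathlib import Path
from typing import Any, Callable, Dict, List, Optional


def save_checkpoint(path: Path, cursor: str) -> None:
    """Write the cursor to a temp file, then rename it over the checkpoint."""
    tmp = path.with_suffix(path.suffix + ".tmp")
    tmp.write_text(json.dumps({"cursor": cursor}))
    os.replace(tmp, path)  # a crash never leaves a half-written file


def load_checkpoint(path: Path) -> Optional[str]:
    """Return the stored cursor, or None if no checkpoint exists yet."""
    if not path.exists():
        return None
    return json.loads(path.read_text())["cursor"]


def resume_load(fetch_page: Callable[[Optional[str]], Dict[str, Any]],
                process_rows: Callable[[List[Any]], None],
                checkpoint_path: Path) -> None:
    """Walk pages from the last checkpoint, saving after each processed page."""
    cursor = load_checkpoint(checkpoint_path)
    while True:
        body = fetch_page(cursor)
        process_rows(body["rows"])
        cursor = body["pagination"]["nextCursor"]
        if cursor is None:
            # End of data. Recording completion is left to the caller.
            return
        save_checkpoint(checkpoint_path, cursor)
```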
Filter-wide totals (optional)
Pass ?includeTotal=true on the fetch endpoint to get pagination.total: the total number of rows that match your filter, independent of cursor and limit. Useful for progress tracking and capacity planning.
Three things to know:
- via (TIER 2) datasets require a time window. For any dataset filtered through a parent collection (D03, D17, D18, D19, D21, D22, D23) you must pass since and/or until; otherwise total comes back as null with totalNote: "window_required". This guards against an unbounded join.
- Counts are capped at a server-side limit. For very large result sets (e.g. millions of rows in a wide time window), total is returned as a lower bound with totalExact: false and totalNote: "capped". Narrow the time window for an exact count.
- Counts have a wall-clock budget. If the count query exceeds the internal timeout, total is null with totalNote: "timeout". The rest of the response (rows, nextCursor) is unaffected; only the count failed. Retry later or narrow the window.
Avoid passing includeTotal=true on every page request, since it’s an extra query. Recommended pattern: pass it once at the start of the load to size the work, then omit it on subsequent pages.
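
The three count outcomes can be handled with a small branch on totalNote, as sketched below. The function name describe_total is hypothetical, and the code assumes totalExact and totalNote sit next to total inside the pagination object.

```python
from typing import Any, Dict


def describe_total(pagination: Dict[str, Any]) -> str:
    """Turn the count fields into a short human-readable status.

    Assumption: total, totalExact and totalNote are siblings inside
    the pagination object of the parsed response body.
    """
    total = pagination.get("total")
    note = pagination.get("totalNote")
    if note == "window_required":
        return "no count: pass since and/or until"
    if note == "timeout":
        return "no count: count query timed out, retry or narrow the window"
    if note == "capped" or pagination.get("totalExact") is False:
        # The server hit its cap, so total is only a lower bound.
        return f"at least {total} rows"
    if total is None:
        return "no count: includeTotal was not requested"
    return f"{total} rows"
```

A loader might call this once on the first response, log the result, and then drop includeTotal from subsequent page requests.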