How-To: Use QueryJobs API Pagination

The QueryJobs API returns a 200-event result buffer for filter (non-aggregate) queries. To retrieve all matching events, you must use cursor-based pagination via the around parameter.

Scripts for both LogScale and NG-SIEM are available for download.

Script Usage

LogScale (direct API)

```shell
# One-time setup
python3 -m venv .venv
source .venv/bin/activate
pip install requests

# Environment variables (required)
export LOGSCALE_TOKEN="your-api-token"
export LOGSCALE_BASE_URL="https://your-logscale-instance.com/"
export LOGSCALE_REPO="your-repo-name"

# Run (wrap the query in single quotes to preserve double quotes in CQL)
python queryjob_paginator.py -q '#event_simpleName=ProcessRollup2'

# With double quotes in the query
python queryjob_paginator.py -q '#event_simpleName=ProcessRollup2 | groupBy([aid], function=[count(as="Event Count")]) | sort("Event Count", limit=max)'

# Optional flags
python queryjob_paginator.py -q '...' -s 1h -e now -o results.json --max-events 5000 --page-size 200

# Help
python queryjob_paginator.py -h
```

Available parameters:

| Flag | Description | Default |
|------|-------------|---------|
| -q, --query | CQL query string (required) | (none) |
| -s, --start | Start time | 15m |
| -e, --end | End time | now |
| -o, --output | Output JSON file | queryjob_results.json |
| --max-events | Max events to retrieve | unlimited |
| --page-size | Events per cursor page | 200 |

NG-SIEM (FalconPy)

```shell
# One-time setup (same venv)
pip install crowdstrike-falconpy

# Environment variables (required)
export FALCON_CLIENT_ID="your-client-id"
export FALCON_CLIENT_SECRET="your-client-secret"
export FALCON_BASE_URL="https://api.us-2.crowdstrike.com"  # optional, defaults to US-1
export CA_BUNDLE="/path/to/ca-bundle.pem"                  # optional, for corporate proxy

# Run
python ngsiem_queryjob_paginator.py
```

Configuration is in the script's Configuration section (REPO, QUERY_STRING, START, END, PAGE_SIZE, MAX_EVENTS).
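For orientation, the constants named above might look like this in the script's Configuration section (a sketch; the values shown are examples, not the script's actual defaults):

```python
# Illustrative Configuration section for ngsiem_queryjob_paginator.py.
# Constant names come from the text above; values are example placeholders.
REPO = "search-all"                              # repository/view to query
QUERY_STRING = "#event_simpleName=ProcessRollup2"  # CQL filter query
START = "15m"                                    # relative start time
END = "now"                                      # end time
PAGE_SIZE = 200                                  # events per cursor page
MAX_EVENTS = 5000                                # stop after this many events
```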

Required API scope: NGSIEM: Read + Write

How QueryJobs API Works

The QueryJobs API flow is as follows:

  1. Create a QueryJob → returns a job ID

  2. Poll (GET) → returns up to 200 events + metadata

  3. Check metadata → hasMoreEvents="true" means more events exist beyond the buffer

  4. Paginate using the around parameter to walk through remaining events
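Steps 1-3 can be sketched with the requests library against the direct LogScale endpoints (a minimal sketch: function names are illustrative and error handling is kept to raise_for_status):

```python
import time
import requests  # pip install requests

def job_payload(query, start="15m", end="now"):
    # Request body for POST /api/v1/repositories/{repo}/queryjobs
    return {"queryString": query, "start": start, "end": end}

def create_queryjob(base_url, token, repo, query):
    # Step 1: create the QueryJob and return its job ID.
    resp = requests.post(
        f"{base_url.rstrip('/')}/api/v1/repositories/{repo}/queryjobs",
        headers={"Authorization": f"Bearer {token}"},
        json=job_payload(query),
    )
    resp.raise_for_status()
    return resp.json()["id"]

def poll_until_done(base_url, token, repo, job_id):
    # Steps 2-3: poll until done=true, waiting metaData.pollAfter ms between polls.
    while True:
        resp = requests.get(
            f"{base_url.rstrip('/')}/api/v1/repositories/{repo}/queryjobs/{job_id}",
            headers={"Authorization": f"Bearer {token}"},
        )
        resp.raise_for_status()
        job = resp.json()
        if job.get("done"):
            return job
        time.sleep(job.get("metaData", {}).get("pollAfter", 1000) / 1000)
```

Remember that done=true only means the query finished; whether all results were delivered is decided by hasMoreEvents.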

Key Metadata Fields

| Field | Meaning |
|-------|---------|
| metaData.resultBufferSize | Events in the buffer (default 200 for filter queries) |
| metaData.eventCount | Number of events in the current result set |
| metaData.processedEvents | Total matching events found by the query |
| metaData.extraData.hasMoreEvents | "true" (a string!) if results exceed the buffer |
| metaData.isAggregate | Aggregate queries return all results in one shot |
| metaData.pollAfter | Milliseconds to wait before the next poll |

Pagination Mechanisms

  1. Offset/Limit (within the buffer only) - Query parameters on the GET poll request:

    • ?paginationLimit=50&paginationOffset=0 - page within the 200-event buffer

    • Only useful for paging within what's already buffered, not for getting more events

  2. Cursor-based, via the around parameter (for results beyond the buffer) - around creates a new QueryJob anchored on a specific event:

    ```json
    {
      "queryString": "#event_simpleName=ProcessRollup2",
      "start": "15m",
      "end": "now",
      "around": {
        "eventId": "@id of anchor event",
        "timestamp": 1777469793821,
        "numberOfEventsBefore": 200,
        "numberOfEventsAfter": 0
      }
    }
    ```

    Critical detail: LogScale returns newest events first. The last event in the buffer is the oldest. To get more events, anchor on the oldest event and request numberOfEventsBefore (older events).

  3. What Does NOT Work:

    • Automatic cursor advancement - the server does NOT track which segments you've consumed. Repeated polls return the same 200 events.

    • dataspaces endpoint - legacy alias; use /api/v1/repositories/ instead.

    • Offset/limit beyond the buffer - paginationOffset only pages within resultBufferSize, not beyond it.
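Putting the cursor mechanism in code: the around body can be built from the last (oldest) event of the previous page. A minimal sketch, where around_payload is an illustrative helper name:

```python
def around_payload(query, anchor_event, start="15m", end="now", page_size=200):
    """Build a new QueryJob body anchored on the oldest event in the buffer.

    Results arrive newest-first, so `anchor_event` should be the LAST event
    of the previous page; we then ask for `numberOfEventsBefore` (older)
    events relative to that anchor.
    """
    return {
        "queryString": query,
        "start": start,
        "end": end,
        "around": {
            "eventId": anchor_event["@id"],
            "timestamp": anchor_event["@timestamp"],
            "numberOfEventsBefore": page_size,
            "numberOfEventsAfter": 0,
        },
    }
```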

Pagination algorithm:

  1. Create QueryJob, poll until done=true

  2. Collect initial 200 events

  3. If hasMoreEvents="true":

    1. Take the LAST event (oldest, since results are newest-first)

    2. Create NEW QueryJob with around:

      • eventId = last event's @id

      • timestamp = last event's @timestamp

      • numberOfEventsBefore = 200

      • numberOfEventsAfter = 0

    3. Poll new job until done, collect events

    4. Deduplicate by @id (boundary event may repeat)

    5. Repeat from sub-step 1 (take the new page's last event as the next anchor) until no new events are returned
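The algorithm above can be sketched as a loop. This is a minimal sketch: run_job is an assumed helper that creates a QueryJob from a request body, polls it to completion, and returns the final response dict (with "events" and "metaData"):

```python
def paginate(run_job, query, start="15m", end="now", page_size=200):
    """Collect all events for a filter query via around-based pagination."""
    body = {"queryString": query, "start": start, "end": end}
    seen_ids, events = set(), []
    while True:
        result = run_job(body)
        # Deduplicate by @id: the anchor event may repeat at page boundaries.
        fresh = [e for e in result["events"] if e["@id"] not in seen_ids]
        for event in fresh:
            seen_ids.add(event["@id"])
            events.append(event)
        has_more = (
            result["metaData"].get("extraData", {}).get("hasMoreEvents") == "true"
        )
        if not fresh or not has_more:
            return events
        # Anchor on the LAST event of the page (oldest, since newest-first)
        # and request older events before it.
        anchor = result["events"][-1]
        body = {
            "queryString": query,
            "start": start,
            "end": end,
            "around": {
                "eventId": anchor["@id"],
                "timestamp": anchor["@timestamp"],
                "numberOfEventsBefore": page_size,
                "numberOfEventsAfter": 0,
            },
        }
```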

API Endpoints

LogScale (direct API)

| Action | Endpoint |
|--------|----------|
| Create | POST /api/v1/repositories/{repo}/queryjobs |
| Poll | GET /api/v1/repositories/{repo}/queryjobs/{id} |
| Cancel | DELETE /api/v1/repositories/{repo}/queryjobs/{id} |

NG-SIEM (via CrowdStrike API gateway)

| Action | Endpoint |
|--------|----------|
| Create | POST /humio/api/v1/repositories/{repo}/queryjobs |
| Poll | GET /humio/api/v1/repositories/{repo}/queryjobs/{id} |
| Cancel | DELETE /humio/api/v1/repositories/{repo}/queryjobs/{id} |

NG-SIEM repositories/views: search-all, investigate_view, third-party, falcon_for_it_view, forensics_view

Important Notes

  • hasMoreEvents is a string ("true" / "false"), not a boolean

  • done=true means the query finished, NOT that all results are delivered

  • QueryJobs auto-delete after 90 seconds of no polling

  • Aggregate queries (groupBy(), count(), etc.) return all results in one response - no pagination needed

  • Rate limit: 6000 concurrent query jobs per CID
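The first note is an easy trap in Python, since a non-empty string is always truthy. A small illustrative helper:

```python
def has_more_events(meta_data):
    # hasMoreEvents arrives as the string "true"/"false", not a boolean.
    # A truthiness test would be wrong: bool("false") is True in Python.
    return meta_data.get("extraData", {}).get("hasMoreEvents") == "true"
```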

SSL / Corporate Proxy

For Zscaler environments, set the CA_BUNDLE env var to the path of the CA certificate bundle. The FalconPy script reads this and passes it as ssl_verify. If the bundle doesn't cover the API endpoint, ssl_verify=False works, but it disables certificate verification entirely; treat it as a last resort.
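One way the script might resolve that value (a sketch; resolve_ssl_verify is an illustrative helper, and the NGSIEM class name should be checked against the FalconPy documentation):

```python
import os

def resolve_ssl_verify():
    """Value for FalconPy's ssl_verify keyword: the CA-bundle path when
    CA_BUNDLE is set, otherwise default certificate verification (True).
    """
    return os.environ.get("CA_BUNDLE") or True

# Hypothetical usage:
# from falconpy import NGSIEM
# client = NGSIEM(
#     client_id=os.environ["FALCON_CLIENT_ID"],
#     client_secret=os.environ["FALCON_CLIENT_SECRET"],
#     ssl_verify=resolve_ssl_verify(),
# )
```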