AUSynth API Reference

Authentication

All API requests require a Bearer token. Obtain an API key from your account dashboard at ausynth.com/account.

Pass the key in the Authorization header:

Authorization: Bearer YOUR_API_KEY

API keys are scoped to your account and inherit your credit balance and tier permissions. Keep your key confidential; anyone with the key can spend your credits.
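Because the key can spend credits, it is usually safer to keep it out of source code. The following is a minimal sketch, assuming the key is stored in an environment variable; the variable name `AUSYNTH_API_KEY` and the helper `auth_headers` are illustrative and not part of the API.

```python
import os


def auth_headers(env_var: str = "AUSYNTH_API_KEY") -> dict:
    """Build the Authorization header from an environment variable.

    The variable name is an assumption made for this sketch.
    """
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"Set {env_var} before calling the API")
    return {"Authorization": f"Bearer {key}"}
```

The returned dictionary can be passed as the `headers` argument of whichever HTTP client you use.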

Base URL

https://api.ausynth.com/v1

All endpoints below are relative to this base URL.

Endpoints

POST /query/preview

Preview the cost and feasibility of a query without executing it or spending credits.

Request Body

{
  "geography_level": "suburb",
  "geography_selections": ["Paddington (QLD)"],
  "dataset_type": "persons",
  "n_observations": 1000,
  "output_format": "parquet",
  "include_geography": "none"
}

Field	Type	Required	Description
`geography_level`	string	Yes	One of: `suburb`, `postcode`, `lga`, `state`, `australia`
`geography_selections`	array of strings	Yes	List of geographic identifiers at the specified level. Suburb names that exist in multiple states must include the state in parentheses, e.g., `"Paddington (QLD)"`.
`dataset_type`	string	Yes	One of: `persons`, `families`, `dwellings`
`n_observations`	integer	Yes	Number of records to generate per geography selection. Must be ≤ pool size for the smallest selected suburb.
`output_format`	string	No	One of: `parquet`, `csv`, `xlsx`. Default: `parquet`.
`include_geography`	string	No	One of: `none`, `hierarchical`. Default: `none`. Hierarchical adds suburb, postcode, LGA, GCCSA, and state columns to each record. Billed at 1.5× standard rate.

Response (200)

{
  "valid": true,
  "n_observations": 1000,
  "credit_cost": 2,
  "pool_available": 8547,
  "geography_matches": [
    {
      "suburb_id": "Paddington (QLD)",
      "pool_size": 8547,
      "small_suburb_flag": false
    }
  ],
  "warnings": []
}

Field	Type	Description
`valid`	boolean	Whether the query can be executed
`n_observations`	integer	Confirmed observation count
`credit_cost`	integer	Credits that would be deducted. 1 credit = 500 observations.
`pool_available`	integer	Total records available across all selected geographies
`geography_matches`	array	Per-geography match details including pool size and small suburb flag
`warnings`	array of strings	Non-fatal warnings (e.g., small suburb advisory)
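The credit arithmetic described above (1 credit = 500 observations, hierarchical geography billed at 1.5×) can be approximated locally. This is a sketch only: rounding fractional credits up is an assumption, it covers a single geography selection, and the value returned by /query/preview remains authoritative. The helper name `estimate_credits` is illustrative.

```python
import math

OBSERVATIONS_PER_CREDIT = 500   # from the credit_cost description
HIERARCHICAL_MULTIPLIER = 1.5   # from the include_geography description


def estimate_credits(n_observations: int, include_geography: str = "none") -> int:
    """Rough local estimate of credit cost for one geography selection."""
    credits = n_observations / OBSERVATIONS_PER_CREDIT
    if include_geography == "hierarchical":
        credits *= HIERARCHICAL_MULTIPLIER
    # Assumption: partial credits are rounded up to the next whole credit.
    return math.ceil(credits)
```

For the example request above, `estimate_credits(1000)` gives 2, which matches the `credit_cost` in the sample response.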

Response (422): Invalid query

{
  "valid": false,
  "detail": "n_observations (5000) exceeds pool size (2341) for suburb 'Inala'",
  "geography_matches": []
}
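A client can inspect the preview body before deciding to execute. The sketch below assumes only the fields shown in the 200 and 422 examples; the function name `preview_problems` and the idea of passing in your current balance are illustrative.

```python
def preview_problems(preview: dict, credits_available: int) -> list:
    """Return reasons not to execute; an empty list means the query can go ahead."""
    if not preview.get("valid", False):
        # Invalid queries carry the explanation in "detail".
        return [preview.get("detail", "query rejected without a detail message")]
    problems = []
    cost = preview["credit_cost"]
    if cost > credits_available:
        problems.append(f"needs {cost} credits, only {credits_available} available")
    return problems
```

An empty list indicates that the preview was valid and affordable; anything else can be logged or shown to the user before any credits are spent.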

POST /query/execute

Execute a query and receive a download URL. Deducts credits from your account.

Request Body

Same schema as /query/preview.

Response (200)

{
  "query_id": "q_abc123def456",
  "status": "completed",
  "credit_cost": 2,
  "credits_remaining": 48,
  "download_url": "https://data.ausynth.com/downloads/q_abc123def456.parquet",
  "download_expires": "2026-05-01T00:00:00Z",
  "metadata": {
    "geography_level": "suburb",
    "geography_selections": ["Paddington (QLD)"],
    "dataset_type": "persons",
    "n_observations": 1000,
    "output_format": "parquet",
    "include_geography": "none",
    "version": "1.0",
    "generated_at": "2026-04-28T10:30:00Z"
  }
}

Field	Type	Description
`query_id`	string	Unique identifier for this query. Use for history lookup.
`status`	string	`completed` on success
`credit_cost`	integer	Credits deducted
`credits_remaining`	integer	Account balance after deduction
`download_url`	string	Signed URL to download the result file. Valid for 7 days.
`download_expires`	string (ISO 8601)	When the download URL expires
`metadata`	object	Echo of query parameters plus version and timestamp

Important: Each execution samples from the pool without replacement, so no record appears twice within a single result. Executions are independent of one another: repeating the same query returns a different subset of records. This is by design, and it supports multiple imputation workflows.

Response (402): Insufficient credits

{
  "detail": "Insufficient credits. Required: 10, available: 3.",
  "credits_remaining": 3
}

GET /query/history

List your past queries with pagination.

Query Parameters

Parameter	Type	Required	Description
`page`	integer	No	Page number (1-indexed). Default: 1.
`per_page`	integer	No	Results per page. Default: 20. Max: 100.
`dataset_type`	string	No	Filter by dataset type

Response (200)

{
  "queries": [
    {
      "query_id": "q_abc123def456",
      "executed_at": "2026-04-28T10:30:00Z",
      "dataset_type": "persons",
      "geography_selections": ["Paddington (QLD)"],
      "n_observations": 1000,
      "credit_cost": 2,
      "download_url": "https://data.ausynth.com/downloads/q_abc123def456.parquet",
      "download_expired": false
    }
  ],
  "pagination": {
    "page": 1,
    "per_page": 20,
    "total_pages": 3,
    "total_queries": 47
  }
}

Queries executed more than 7 days ago are returned with download_expired: true, and their download URLs no longer work. Re-execute the query to generate a fresh download (this deducts credits again and produces a new sample).
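Walking the full history means following the `pagination` object page by page. The sketch below keeps the HTTP call out of the loop: `fetch_page` is a hypothetical callable that you supply, which performs GET /query/history for a given page number and returns the parsed JSON body.

```python
from typing import Callable, Iterator


def iter_history(fetch_page: Callable[[int], dict]) -> Iterator[dict]:
    """Yield every query record, requesting pages until total_pages is reached."""
    page = 1
    while True:
        body = fetch_page(page)
        yield from body["queries"]
        if page >= body["pagination"]["total_pages"]:
            break
        page += 1


def still_downloadable(records) -> list:
    """Keep only records whose download link has not expired."""
    return [r for r in records if not r["download_expired"]]
```

With the `requests` library, `fetch_page` could be a small function that calls `requests.get` on the history endpoint with `params={"page": page, "per_page": 100}` and returns `.json()`. Each page counts as one request against the rate limit.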

GET /account/credits

Check your current credit balance and tier.

Response (200)

{
  "credits_remaining": 48,
  "tier": "professional",
  "credits_purchased": 100,
  "credits_used": 52,
  "free_credits_remaining": 0,
  "free_credits_reset": null
}

For free-tier accounts, free_credits_remaining shows how much of the weekly free allocation is left, and free_credits_reset gives the next reset time (ISO 8601).

GET /geography/search

Search for suburbs, postcodes, LGAs, or states by name.

Query Parameters

Parameter	Type	Required	Description
`query`	string	Yes	Search term (partial match supported)
`state`	string	No	Filter by state abbreviation (NSW, VIC, QLD, SA, WA, TAS, NT, ACT)
`level`	string	No	Geographic level to search. Default: `suburb`. One of: `suburb`, `postcode`, `lga`, `state`.
`limit`	integer	No	Max results. Default: 10. Max: 50.

Response (200)

{
  "results": [
    {
      "suburb_id": "Paddington (NSW)",
      "state": "NSW",
      "postcode": "2021",
      "lga": "Woollahra",
      "gccsa": "Greater Sydney",
      "pool_size": 14201,
      "small_suburb_flag": false,
      "datasets_available": ["persons", "families", "dwellings"]
    },
    {
      "suburb_id": "Paddington (QLD)",
      "state": "QLD",
      "postcode": "4064",
      "lga": "Brisbane",
      "gccsa": "Greater Brisbane",
      "pool_size": 8547,
      "small_suburb_flag": false,
      "datasets_available": ["persons", "families", "dwellings"]
    }
  ],
  "total_matches": 2
}

Suburb names that exist in multiple states include the state in parentheses. Unique suburb names do not. For example, "Toorak" exists only in VIC, so it appears as "Toorak" without a state qualifier.

GET /geography/{suburb_id}/variables

Retrieve the list of available variables and their category counts for a specific suburb and dataset.

Path Parameters

Parameter	Type	Description
`suburb_id`	string	Suburb identifier from geography search

Query Parameters

Parameter	Type	Required	Description
`dataset_type`	string	Yes	One of: `persons`, `families`, `dwellings`

Response (200)

{
  "suburb_id": "Paddington (QLD)",
  "dataset_type": "persons",
  "variables": [
    {"code": "AGE5P", "description": "Age in Five Year Groups", "n_categories": 21},
    {"code": "SEXP", "description": "Sex", "n_categories": 2},
    {"code": "INCP", "description": "Total Personal Income (weekly)", "n_categories": 17}
  ]
}
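Suburb identifiers such as "Paddington (QLD)" contain spaces and parentheses, so they need escaping before being placed in the path. The sketch below assumes the API accepts standard percent-encoding in the path segment, which this reference does not state explicitly; `variables_url` is an illustrative helper.

```python
from urllib.parse import quote, urlencode

BASE = "https://api.ausynth.com/v1"


def variables_url(suburb_id: str, dataset_type: str) -> str:
    """Build the variables endpoint URL with the suburb ID percent-encoded."""
    # safe="" also encodes "/" so the ID always stays a single path segment.
    segment = quote(suburb_id, safe="")
    query = urlencode({"dataset_type": dataset_type})
    return f"{BASE}/geography/{segment}/variables?{query}"
```

Most HTTP clients encode query parameters for you but leave the path untouched, which is why the path segment is encoded by hand here.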

Error Codes

HTTP Status	Code	Description
200	(none)	Success
400	`bad_request`	Malformed request body or missing required fields
401	`unauthorized`	Missing or invalid API key
402	`insufficient_credits`	Not enough credits for this query
404	`not_found`	Geography or query ID not found
422	`validation_error`	Valid JSON but logically invalid query (e.g., n_observations exceeds pool)
429	`rate_limited`	Too many requests. See Rate Limits.
500	`internal_error`	Server error. Retry with backoff.

All error responses include a detail field with a human-readable explanation. The example below also carries the error code from the table above in an error field:

{
  "error": "validation_error",
  "detail": "Unknown suburb: 'Paddingtn'. Did you mean 'Paddington (QLD)'?"
}
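One way to surface these errors in client code is to turn a non-200 response into an exception that carries the status, code, and detail. This is a sketch, not an official client: the class `AUSynthError` and the function `raise_for_error` are illustrative names, and falling back to the table's code when the body has no `error` field is an assumption based on the 402 and 422 examples earlier in this reference.

```python
# Status-to-code mapping copied from the Error Codes table.
STATUS_CODES = {
    400: "bad_request",
    401: "unauthorized",
    402: "insufficient_credits",
    404: "not_found",
    422: "validation_error",
    429: "rate_limited",
    500: "internal_error",
}


class AUSynthError(Exception):
    """Carries the HTTP status, error code, and human-readable detail."""

    def __init__(self, status: int, code: str, detail: str):
        super().__init__(f"{status} {code}: {detail}")
        self.status = status
        self.code = code
        self.detail = detail
        # The table suggests retrying only rate limits and server errors.
        self.retryable = status in (429, 500)


def raise_for_error(status: int, body: dict) -> None:
    """Raise AUSynthError for any non-200 response; do nothing on success."""
    if status == 200:
        return
    code = body.get("error") or STATUS_CODES.get(status, "unknown")
    raise AUSynthError(status, code, body.get("detail", ""))
```

Callers can then branch on `err.code` or `err.retryable` rather than re-parsing response bodies at every call site.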

Rate Limits

The limit is 60 requests per minute per API key, measured with a sliding window. When it is exceeded, the API returns HTTP 429 with a Retry-After header indicating the number of seconds to wait:

HTTP/1.1 429 Too Many Requests
Retry-After: 12

Best practice: implement exponential backoff starting at 1 second, with a maximum wait of 60 seconds.
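The wait calculation can be sketched as a small pure function: honour Retry-After when the server sends it, and otherwise double the delay from 1 second up to the 60-second ceiling. The function name `wait_seconds` and the fallback for an unparseable header are choices made for this sketch.

```python
def wait_seconds(attempt: int, retry_after=None, base: float = 1.0, cap: float = 60.0) -> float:
    """Seconds to sleep before retry number `attempt` (0-based).

    Prefers the Retry-After header value; falls back to exponential backoff.
    """
    if retry_after is not None:
        try:
            return max(0.0, float(retry_after))
        except (TypeError, ValueError):
            pass  # header was not a plain number of seconds; use backoff instead
    return min(cap, base * (2 ** attempt))
```

In a retry loop, call `time.sleep(wait_seconds(attempt, response.headers.get("Retry-After")))` after each 429 and give up after a fixed number of attempts.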

Code Examples

Python (requests)

import requests
import pandas as pd

API_KEY = "your-api-key"
BASE = "https://api.ausynth.com/v1"
HEADERS = {"Authorization": f"Bearer {API_KEY}"}

# Preview
query = {
    "geography_level": "suburb",
    "geography_selections": ["Toorak"],
    "dataset_type": "persons",
    "n_observations": 2000,
    "output_format": "parquet"
}

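resp = requests.post(f"{BASE}/query/preview", json=query, headers=HEADERS, timeout=30)
resp.raise_for_status()  # stop on 4xx/5xx instead of parsing an error body
preview = resp.json()
print(f"Cost: {preview['credit_cost']} credits")

# Execute (this deducts credits)
resp = requests.post(f"{BASE}/query/execute", json=query, headers=HEADERS, timeout=30)
resp.raise_for_status()
result = resp.json()
# Reading Parquet requires pyarrow or fastparquet to be installed
df = pd.read_parquet(result["download_url"])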
print(df.describe())

R (httr)

library(httr)
library(jsonlite)
library(arrow)

api_key <- "your-api-key"
base_url <- "https://api.ausynth.com/v1"

query <- list(
  geography_level = "suburb",
  geography_selections = list("Toorak"),
  dataset_type = "persons",
  n_observations = 2000,
  output_format = "parquet"
)

# Preview
preview <- POST(
  paste0(base_url, "/query/preview"),
  add_headers(Authorization = paste("Bearer", api_key)),
  body = query, encode = "json"
) |> content(as = "parsed")

cat(sprintf("Cost: %d credits\n", preview$credit_cost))

# Execute
result <- POST(
  paste0(base_url, "/query/execute"),
  add_headers(Authorization = paste("Bearer", api_key)),
  body = query, encode = "json"
) |> content(as = "parsed")

df <- read_parquet(result$download_url)
summary(df)

curl

# Preview
curl -X POST https://api.ausynth.com/v1/query/preview \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "geography_level": "suburb",
    "geography_selections": ["Toorak"],
    "dataset_type": "persons",
    "n_observations": 2000,
    "output_format": "csv"
  }'

# Execute
curl -X POST https://api.ausynth.com/v1/query/execute \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "geography_level": "suburb",
    "geography_selections": ["Toorak"],
    "dataset_type": "persons",
    "n_observations": 2000,
    "output_format": "csv"
  }'

JavaScript (fetch)

const API_KEY = "your-api-key";
const BASE = "https://api.ausynth.com/v1";

const query = {
  geography_level: "suburb",
  geography_selections: ["Toorak"],
  dataset_type: "persons",
  n_observations: 2000,
  output_format: "csv"
};

// Preview
const preview = await fetch(`${BASE}/query/preview`, {
  method: "POST",
  headers: {
    "Authorization": `Bearer ${API_KEY}`,
    "Content-Type": "application/json"
  },
  body: JSON.stringify(query)
}).then(r => r.json());

console.log(`Cost: ${preview.credit_cost} credits`);

// Execute
const result = await fetch(`${BASE}/query/execute`, {
  method: "POST",
  headers: {
    "Authorization": `Bearer ${API_KEY}`,
    "Content-Type": "application/json"
  },
  body: JSON.stringify(query)
}).then(r => r.json());

console.log(`Download: ${result.download_url}`);

Best Practices

Cache the geography index locally. The geography search endpoint is rate-limited like all others. If you routinely query the same suburbs, cache the suburb identifiers and pool sizes rather than searching each time.

Use the preview endpoint before execute. Preview validates your query, confirms pool availability, and reports the credit cost, all without spending credits. This is especially useful when building queries programmatically.

Handle 429 responses with exponential backoff. At 60 requests per minute, batch workflows can hit the rate limit. Implement retry logic with exponential backoff (1s, 2s, 4s, …) up to a maximum of 60 seconds.

Request Parquet for large downloads. Parquet files are substantially smaller than CSV for the same data, load faster, and preserve column types. Use CSV or XLSX only when you need compatibility with tools that do not support Parquet.

Use multiple imputations for statistical inference. Each query returns a different sample from the suburb's pool. Run your analysis on 5–20 independent samples and combine results using Rubin's rules to account for sampling variability.
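For the multiple-imputation workflow, the combining step can be sketched as follows. It implements the standard form of Rubin's rules for a scalar estimate: the pooled estimate is the mean of the per-sample estimates, and the total variance is the mean within-sample variance plus (1 + 1/m) times the between-sample variance. The function name `pool_rubin` is illustrative; producing each estimate and its variance from a downloaded sample is left to your own analysis code.

```python
from statistics import mean, variance


def pool_rubin(estimates, variances):
    """Combine m per-sample estimates and their variances using Rubin's rules.

    Returns (pooled_estimate, total_variance).
    """
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need at least two estimates and one variance per estimate")
    pooled = mean(estimates)
    within = mean(variances)        # average within-sample variance
    between = variance(estimates)   # sample variance across estimates (m - 1 denominator)
    total = within + (1 + 1 / m) * between
    return pooled, total
```

The square root of the total variance gives the pooled standard error. Keep in mind that every sample is a separate /query/execute call and is billed accordingly.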

API Changelog

v1 (May 2026)

Initial API release supporting query preview, execute, history, credits, and geography search. Output formats: Parquet, CSV, XLSX. Hierarchical geography option. Without-replacement sampling from suburb pools.

See also: Quick Start: Python · Quick Start: R · FAQ