Reddit Comment Thread Scraper API
Point our Reddit comment scraper at one comment permalink and it returns that comment with its entire descendant reply subtree nested as structured JSON, each node carrying its real score, depth, author, and parent id.
Why Reddit Comment Thread data is hard to collect
Reading a single comment branch from Reddit's Data API means registering an OAuth app and staying inside the free ceiling of 100 queries per minute, the limit set when Reddit started charging $0.24 per 1,000 calls on July 1, 2023. A deep reply chain rarely arrives in one response either: collapsed branches sit behind "more" placeholder nodes, and each one costs a separate /api/morechildren request to expand, so reconstructing one full thread can take a dozen round trips that you have to orchestrate and rate-limit yourself.
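To make the throttling burden concrete, here is a minimal sketch of a client-side pacer for a 100-queries-per-minute ceiling. The `Pacer` class and its injectable `clock` and `sleep` arguments are illustrative names, not part of Reddit's API or ours, and a production client would also need to react to the rate-limit headers the server sends back.

```python
import time


class Pacer:
    """Spaces calls at least 60 / per_minute seconds apart."""

    def __init__(self, per_minute=100, clock=time.monotonic, sleep=time.sleep):
        self.interval = 60.0 / per_minute  # 0.6 s between calls at 100 QPM
        self._clock = clock
        self._sleep = sleep
        self._next_allowed = None  # no call made yet, so the first one is free

    def wait(self):
        """Block until the next call is allowed, then reserve the following slot."""
        now = self._clock()
        if self._next_allowed is not None and now < self._next_allowed:
            self._sleep(self._next_allowed - now)
            now = self._next_allowed
        self._next_allowed = now + self.interval
```

Calling `pacer.wait()` before each request means a dozen back-to-back morechildren calls spend at least 11 × 0.6 s = 6.6 s on pacing alone, before any network time.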
Call the Reddit Comment Thread Scraper API in one request
curl "https://api.redditscraperapi.com/api/v1/reddit/comment?url=https://www.reddit.com/r/python/comments/1abc234/favorite_library/kf09xz2/&sort=top&api_key=$API_KEY"

import requests

resp = requests.get(
    "https://api.redditscraperapi.com/api/v1/reddit/comment",
    params={
        # a comment permalink, or post_id + comment_id + subreddit
        "url": "https://www.reddit.com/r/python/comments/1abc234/favorite_library/kf09xz2/",
        "sort": "top",
        "api_key": "YOUR_API_KEY",
    },
    timeout=30,  # requests has no default timeout, so set one explicitly
)
data = resp.json()
root = data["comment"]


def walk(node, level=0):
    # one line per comment, indented by its position in the branch
    print("  " * level + f"[{node['score']}] {node['author']['username']}: {node['body'][:80]}")
    for reply in node["replies"]:
        walk(reply, level + 1)


walk(root)
Parameters
| Parameter | Required | Default | Notes |
|---|---|---|---|
| url | optional | - | A comment permalink in either form: /r/{sub}/comments/{post_id}/comment/{comment_id}/ or /r/{sub}/comments/{post_id}/{slug}/{comment_id}/. We parse the subreddit, post id, and comment id out of it, so a link copied from the browser works. Pass this or the post_id + comment_id pair. |
| post_id | optional | - | The parent post's base-36 id, with or without the t3_ prefix (for example 1abc234). Pair it with comment_id and subreddit when you are not passing a url. |
| comment_id | optional | - | The target comment's base-36 id, with or without the t1_ prefix (for example kf09xz2). This comment becomes the root of the returned subtree. |
| subreddit | optional | - | The subreddit the thread lives in, without the leading r/. Required when you pass ids instead of a url, since the comment surface is keyed by /r/{sub}/. |
| sort | optional | top | Comment sort of the underlying thread fetch: top, new, or controversial. It shapes which sibling branches are loaded first while we locate your comment. |
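The two input styles in the table can be sketched as a small helper. This is an illustration only: `ids_from_permalink`, `strip_prefix`, and `build_params` are hypothetical names, and the regular expression is our own reading of the two permalink shapes listed above, not the parser the service runs. Stripping the t1_ and t3_ prefixes is optional, since the endpoint accepts ids with or without them.

```python
import re

# Matches both documented permalink shapes:
#   /r/{sub}/comments/{post_id}/comment/{comment_id}/
#   /r/{sub}/comments/{post_id}/{slug}/{comment_id}/
_PERMALINK = re.compile(
    r"/r/(?P<subreddit>[^/]+)/comments/(?P<post_id>[0-9a-z]+)"
    r"/[^/]+/(?P<comment_id>[0-9a-z]+)/?(?:[?#].*)?$",
    re.IGNORECASE,
)

_SORTS = {"top", "new", "controversial"}


def strip_prefix(thing_id, prefix):
    """Return the bare base-36 id, dropping a leading t1_/t3_ prefix if present."""
    return thing_id[len(prefix):] if thing_id.startswith(prefix) else thing_id


def ids_from_permalink(url):
    """Split a comment permalink into subreddit, post id, and comment id."""
    match = _PERMALINK.search(url)
    if match is None:
        raise ValueError(f"not a comment permalink: {url!r}")
    return match.groupdict()


def build_params(api_key, *, url=None, post_id=None, comment_id=None,
                 subreddit=None, sort="top"):
    """Build the query dict from either a url or the three-id form."""
    if sort not in _SORTS:
        raise ValueError(f"sort must be one of {sorted(_SORTS)}")
    params = {"sort": sort, "api_key": api_key}
    if url is not None:
        params["url"] = url
    elif post_id and comment_id and subreddit:
        params["post_id"] = strip_prefix(post_id, "t3_")
        params["comment_id"] = strip_prefix(comment_id, "t1_")
        params["subreddit"] = subreddit
    else:
        raise ValueError("pass url, or post_id + comment_id + subreddit")
    return params
```

The returned dict can be handed straight to `requests.get(..., params=...)`, which takes care of URL-encoding the values.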
Fields the Reddit Comment Thread Scraper API returns
{
"comment": {
"id": "t1_kf09xz2",
"parent_id": null,
"depth": 0,
"score": 412,
"author": { "username": "pandas_fan", "id": "t2_8x1q4" },
"body": "requests, every single time. Nothing else reads this cleanly.",
"created": "2026-05-31T14:40:51+00:00",
"permalink": "https://www.reddit.com/r/python/comments/1abc234/_/kf09xz2/",
"post_id": "t3_1abc234",
"replies": [
{
"id": "t1_kf0a7m4",
"parent_id": "t1_kf09xz2",
"depth": 1,
"score": 88,
"author": { "username": "async_pete", "id": "t2_2m9pd" },
"body": "httpx if you need async though.",
"created": "2026-05-31T14:52:19+00:00",
"permalink": "https://www.reddit.com/r/python/comments/1abc234/_/kf0a7m4/",
"replies": []
}
]
},
"post_ref": {
"id": "t3_1abc234",
"subreddit": "python",
"permalink": "https://www.reddit.com/r/python/comments/1abc234/"
},
"_meta": {
"source": "svc-comments+more-comments",
"sort": "top",
"pages_fetched": 2,
"truncated": false
}
}
| Field | Type | Description |
|---|---|---|
| comment | object | The target comment as the root of its reply subtree, carrying every field below plus a recursive replies array of the same shape. |
| comment.id | string | The comment's t1_ thing id, unique across Reddit. |
| comment.parent_id | string or null | The t1_ id of the comment this one answers. It is null when the requested comment is itself the visible root of the returned subtree. |
| comment.depth | integer | Nesting level read from Reddit's markup. The deeper a reply sits in the original thread, the higher this number. |
| comment.score | integer | Net upvotes (upvotes minus downvotes) on the comment at request time, read from Reddit's live rendered markup so the value reflects real votes. |
| comment.author | object | The commenter as { username, id }, where id is the t2_ account id when Reddit exposes it. username is [deleted] for removed accounts. |
| comment.body | string | The comment text decoded to plain readable text. |
| comment.created | string | ISO-8601 timestamp of when the comment was posted. |
| comment.permalink | string (url) | Absolute www.reddit.com link straight to the comment. |
| comment.post_id | string | The t3_ id of the post the comment belongs to. |
| comment.replies | array | Direct child replies, each one a comment object with its own replies array, so the whole branch is already nested for you. |
| post_ref | object | A lightweight pointer to the parent thread: { id, subreddit, permalink }, enough to link back without pulling the full post. |
| _meta | object | Run detail: source surfaces used, the sort applied, pages_fetched, and a truncated flag when deeper continuations were left unfetched. |
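Because every node carries its own id, parent_id, and depth, the nested tree can be turned into flat rows for a CSV or database table. The sketch below assumes the field names shown in the table; `flatten` is a hypothetical helper, and `sample` is a trimmed copy of the example response rather than live output.

```python
def flatten(node, rows=None):
    """Depth-first walk that turns the nested tree into a flat list of rows."""
    if rows is None:
        rows = []
    rows.append({
        "id": node["id"],
        "parent_id": node["parent_id"],
        "depth": node["depth"],
        "score": node["score"],
        "author": node["author"]["username"],
        "body": node["body"],
    })
    for reply in node.get("replies", []):
        flatten(reply, rows)
    return rows


# Trimmed copy of the example response above.
sample = {
    "comment": {
        "id": "t1_kf09xz2", "parent_id": None, "depth": 0, "score": 412,
        "author": {"username": "pandas_fan", "id": "t2_8x1q4"},
        "body": "requests, every single time. Nothing else reads this cleanly.",
        "replies": [{
            "id": "t1_kf0a7m4", "parent_id": "t1_kf09xz2", "depth": 1, "score": 88,
            "author": {"username": "async_pete", "id": "t2_2m9pd"},
            "body": "httpx if you need async though.",
            "replies": [],
        }],
    }
}

rows = flatten(sample["comment"])
for row in rows:
    print(row["depth"], row["id"], row["parent_id"], row["score"])
```

Rows come out in depth-first order, so a parent always appears before its replies.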
What you can build with it
Conversation thread export
Q and A pair extraction
LLM and RAG datasets
Reply sentiment tracking
Moderation review
Deep branch recovery
Why developers pick our Reddit Comment Thread Scraper API
Our Reddit comment scraper runs the residential fetch, anti-bot handling, and parsing on our infrastructure, walks the live comment surface and any load-more continuations, then hands back your target comment with its full descendant subtree already nested and deduplicated. A comment permalink goes in and the branch comes out in about 2.6 seconds, with no OAuth app to register and a free tier of 1,000 requests.
No OAuth app of your own
Full reply subtree, already nested
Real scores per node
Load-more continuations followed
Anti-bot and residential proxies
Pay for success
Reddit Comment Thread Scraper API compared to the official Reddit API
| Factor | Our Reddit comment scraper API | DIY API plus morechildren | Reddit Data API |
|---|---|---|---|
| Setup | One api_key | OAuth app, token refresh, parser | Registered OAuth app plus token |
| Rate limit | Plan request limit only | 100 QPM you have to throttle | 100 queries per minute, OAuth-gated |
| Deep replies | Subtree fetched and merged | One morechildren call per branch | "more" nodes you expand yourself |
| Output shape | Comment with nested replies | Flat listings you re-tree | Nested Listing JSON to traverse |
| Vote scores | Net score on every node | Present while quota lasts | Included while quota lasts |
| Anti-bot and proxies | Handled for you | Not your concern with OAuth | Not applicable inside the API |
| Input | Comment permalink or ids | Comment id list per request | Article id plus child ids |
Free to start, priced to scale
| Plan | Price | Best for |
|---|---|---|
| Free | 1,000 requests | Testing and small jobs |
| Pro | $0.60 / 1k | Production workloads |
| Pay-as-you-go | $0.90 / 1k | Spiky or one-off volume |
Median response 2.6s. You only pay for successful requests.
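As a rough worked example of the per-1,000 rates, the sketch below multiplies a count of successful requests by a plan price. `estimate_cost` is a hypothetical helper. It ignores the free tier, because how the 1,000 free requests interact with a paid plan is not stated here, and it counts only successful requests, since failed ones are not billed.

```python
from decimal import Decimal


def estimate_cost(successful_requests, price_per_1k):
    """Dollar cost of a number of successful requests at a per-1,000 rate."""
    return Decimal(successful_requests) / Decimal(1000) * Decimal(price_per_1k)


pro = estimate_cost(50_000, "0.60")    # Pro rate from the table
payg = estimate_cost(50_000, "0.90")   # pay-as-you-go rate from the table
print(f"Pro: ${pro:.2f}, pay-as-you-go: ${payg:.2f}")
```

At 50,000 successful requests that works out to $30.00 on Pro against $45.00 on pay-as-you-go.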
FAQ
Send the comment's permalink to our reddit/comment endpoint with your api_key, or pass post_id, comment_id, and subreddit instead of a url. The response returns that comment as the root of a tree, with every descendant reply nested inside a replies array and each node carrying its score, depth, author, body, created timestamp, and permalink. You do not register a Reddit OAuth app or run any scraping infrastructure of your own.
No, you do not need a Reddit API key or OAuth app of your own. Reddit's Data API requires a registered OAuth app and caps free access at 100 queries per minute per client id, a limit set when Reddit began charging $0.24 per 1,000 calls on July 1, 2023. Our endpoint fetches and parses the comment thread for you behind a single api_key, so there is no app to register, no client secret, and no per-minute approval to apply for. The free tier covers 1,000 requests.
The post endpoint takes a post and returns the submission plus its top-level comment list. This comment endpoint takes one specific comment and returns that comment plus its full descendant reply subtree, nested. Use the post endpoint to scan a whole thread, and use this one when you care about a single branch and everything posted beneath it.
Yes, replies come back nested. The target comment is the root object, and its direct children live in a replies array where every entry is another comment with its own replies. Depth and parent_id are filled in on each node too, so you can read the tree as-is, flatten it, or group a comment with its children without rebuilding the structure yourself.
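Grouping a comment with its children can be sketched as a generator over (parent, reply) pairs, which is one way to approach Q and A pair extraction. `reply_pairs` is a hypothetical helper, and the small `tree` below is made-up data that uses only the id, body, and replies fields.

```python
def reply_pairs(node):
    """Yield (parent, child) for every direct reply anywhere in the subtree."""
    for child in node.get("replies", []):
        yield node, child
        yield from reply_pairs(child)


# Made-up three-level branch for illustration.
tree = {
    "id": "t1_a", "body": "Which HTTP library?", "replies": [
        {"id": "t1_b", "body": "requests", "replies": [
            {"id": "t1_d", "body": "agreed", "replies": []},
        ]},
        {"id": "t1_c", "body": "httpx", "replies": []},
    ],
}

pairs = [(parent["id"], child["id"]) for parent, child in reply_pairs(tree)]
print(pairs)
```

Each pair holds the full node dicts, so the caller can filter on score or depth before keeping a pair.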
Yes, it works from Python. Call the endpoint with the requests library, read resp.json(), and recurse through data['comment']['replies'] to print or store the whole branch. The quickstart above is a complete working Python example, and any language that can make an HTTP GET request works the same way, since the response is plain JSON. If you also want to pull Reddit posts and comments together across a whole thread, the post endpoint returns the submission with its top-level comments.
If the target comment was removed, deleted, or sits deeper than the pages we fetch, the call returns a clean not-found response so you never get back a fabricated branch. When replies are collapsed behind a load more comments link, we follow those continuations and merge them into the tree, and the _meta.truncated flag tells you if any deeper branch was left unfetched.
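A caller can check completeness after each response with something like the sketch below. `branch_report` is a hypothetical helper that relies only on the comment and _meta fields documented above. Treating a missing truncated flag as false is our assumption, and the not-found case is left out because its response shape is not documented here.

```python
def branch_report(payload):
    """Count nodes and find the deepest level, alongside the truncated flag."""
    count, max_depth = 0, 0
    stack = [payload["comment"]]
    while stack:  # iterative walk, so very deep chains cannot hit the recursion limit
        node = stack.pop()
        count += 1
        max_depth = max(max_depth, node["depth"])
        stack.extend(node.get("replies", []))
    meta = payload.get("_meta", {})
    return {
        "nodes": count,
        "max_depth": max_depth,
        "pages_fetched": meta.get("pages_fetched"),
        # Assumption: an absent flag means nothing was left unfetched.
        "complete": not meta.get("truncated", False),
    }


# Trimmed copy of the example response.
payload = {
    "comment": {"id": "t1_kf09xz2", "depth": 0, "replies": [
        {"id": "t1_kf0a7m4", "depth": 1, "replies": []},
    ]},
    "_meta": {"sort": "top", "pages_fetched": 2, "truncated": False},
}

report = branch_report(payload)
print(report)
```

When complete is false, the branch is still valid as far as it goes, but some deeper replies were not fetched.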
The free tier covers 1,000 requests, Pro pricing is $0.60 per 1,000 requests, and pay-as-you-go is $0.90 per 1,000 requests. Responses return in a median of 2.6 seconds, and you are billed only for successful requests.