feat: add server side atproto blog apis #841
Conversation
Codecov Report: ✅ All modified and coverable lines are covered by tests.

📝 Walkthrough: Adds Bluesky / AT Protocol integration and blog tooling. Introduces two cached server API endpoints.

🚥 Pre-merge checks: ✅ 1 passed.
Actionable comments posted: 3
🧹 Nitpick comments (5)
server/api/atproto/author-profiles.get.ts (2)
44-46: Redundant array check after Valibot parse.

After `v.parse(v.array(AuthorSchema), parsed)` succeeds, `authors` is guaranteed to be an array, so the `Array.isArray(authors)` check is unnecessary.

♻️ Suggested simplification

```diff
- if (!Array.isArray(authors) || authors.length === 0) {
+ if (authors.length === 0) {
    return { authors: [] }
  }
```
97-100: Cache key may vary with JSON formatting differences.

Using the raw `authors` query string as the cache key means semantically identical requests with different JSON formatting (e.g. spacing, key order) will create separate cache entries. Consider normalising the key.

♻️ Suggested improvement

```diff
  getKey: event => {
    const { authors } = getQuery(event)
-   return `author-profiles:${authors}`
+   // Normalise by parsing and re-stringifying
+   try {
+     const parsed = JSON.parse(authors as string)
+     return `author-profiles:${JSON.stringify(parsed)}`
+   } catch {
+     return `author-profiles:${authors}`
+   }
  },
```

server/api/atproto/bluesky-comments.get.ts (3)
80-85: Array access without bounds check.

Line 83 accesses `postsResponse.posts?.[0]`, which could be undefined if the array is empty. While optional chaining handles this, the coding guidelines recommend explicit checks when accessing array elements by index for clarity.

♻️ Suggested explicit check

```diff
+ const postMetadata = postsResponse.posts.length > 0 ? postsResponse.posts[0] : null
+
  return {
    thread,
    likes: likesResponse.likes || [],
-   totalLikes: postsResponse.posts?.[0]?.likeCount || thread?.likeCount || 0,
+   totalLikes: postMetadata?.likeCount ?? thread?.likeCount ?? 0,
    postUrl: atUriToWebUrl(uri),
  }
```
109-114: Destructuring match result without null check on capture groups.

While the regex guarantees capture groups exist when a match occurs, TypeScript cannot infer this: the destructured `did` and `rkey` could be typed as `string | undefined`. Consider adding explicit checks, or a non-null assertion with a comment explaining the guarantee.

♻️ Safer destructuring

```diff
  function atUriToWebUrl(uri: string): string | null {
    const match = uri.match(AT_URI_REGEX)
    if (!match) return null
-   const [, did, rkey] = match
+   const did = match[1]
+   const rkey = match[2]
+   if (!did || !rkey) return null // Defensive check for type safety
    return `https://bsky.app/profile/${did}/post/${rkey}`
  }
```
101-104: Cache key computed before URI validation.

The `getKey` function extracts the URI directly from query params without validation. If an invalid URI is provided, it will still be used as the cache key before the handler throws a 400 error, potentially polluting the cache namespace with invalid keys.

♻️ Suggested improvement

```diff
  getKey: event => {
    const { uri } = getQuery(event)
-   return `bluesky:${uri}`
+   // Only cache valid URIs - invalid ones won't be stored anyway due to error response
+   return typeof uri === 'string' ? `bluesky:${uri}` : 'bluesky:invalid'
  },
```

Alternatively, consider validating in `getKey` to return a consistent key for invalid inputs, preventing cache key pollution.
```ts
} else if (value.startsWith('[') && value.endsWith(']')) {
  acc[key] = value
    .slice(1, -1)
    .split(',')
    .map(s => s.trim().replace(/^["']|["']$/g, ''))
} else acc[key] = value
```
Empty array produces a single-element array with empty string.
When parsing `[]`, `value.slice(1, -1)` yields `''`, and `''.split(',')` produces `['']` rather than `[]`. This may cause downstream issues when consumers expect an empty array.
🛠️ Proposed fix
```diff
  else if (value.startsWith('[') && value.endsWith(']')) {
-   acc[key] = value
-     .slice(1, -1)
-     .split(',')
-     .map(s => s.trim().replace(/^["']|["']$/g, ''))
+   const inner = value.slice(1, -1).trim()
+   acc[key] = inner
+     ? inner.split(',').map(s => s.trim().replace(/^["']|["']$/g, ''))
+     : []
  } else acc[key] = value
```
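The behaviour is easy to confirm in isolation. Below is a standalone sketch contrasting the naive split with the guarded version (the `parseArrayValue` helper names are illustrative, not from the PR):

```typescript
// Naive parse: ''.split(',') yields [''], so '[]' becomes a 1-element array.
function parseArrayValueNaive(value: string): string[] {
  return value
    .slice(1, -1)
    .split(',')
    .map(s => s.trim().replace(/^["']|["']$/g, ''))
}

// Guarded parse: check for an empty interior before splitting.
function parseArrayValue(value: string): string[] {
  const inner = value.slice(1, -1).trim()
  return inner
    ? inner.split(',').map(s => s.trim().replace(/^["']|["']$/g, ''))
    : []
}

console.log(parseArrayValueNaive('[]'))    // [ '' ]  (surprising single empty string)
console.log(parseArrayValue('[]'))         // []
console.log(parseArrayValue('["a", "b"]')) // [ 'a', 'b' ]
```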
@Kai-ros is this comment valid?
Seems like the humans are having a chat. I'll hop back into my burrow for now. If you need me again, just tag @coderabbitai in a new comment, and I'll come hopping out!
Actionable comments posted: 2
🧹 Nitpick comments (4)
package.json (1)
42-42: Inconsistent version pinning for `@atproto/common`.

This dependency is pinned to an exact version (`0.5.10`) whilst other `@atproto/*` packages (lines 41, 44) use caret ranges (`^`). This could lead to version mismatches between related packages. Consider using `^0.5.10` for consistency, or document why exact pinning is required here.

💡 Suggested change

```diff
- "@atproto/common": "0.5.10",
+ "@atproto/common": "^0.5.10",
```

modules/standard-site-sync.ts (3)
14-14: Consider renaming for clarity.

The constant converts milliseconds (from `Date.getTime()`) to microseconds. `MILLISECONDS_TO_MICROSECONDS` would more accurately describe its purpose.
46-48: Inconsistent error logging format.

Line 47 uses string concatenation whilst line 65 uses a comma-separated approach. For consistency and better stack trace formatting, consider using the comma-separated style throughout.

💡 Suggested change

```diff
  batch.map(file =>
-   syncFile(file, NPMX_SITE, client).catch(error =>
-     console.error(`[standard-site-sync] Error in ${file}:` + error),
+   syncFile(file, NPMX_SITE, client).catch(err =>
+     console.error(`[standard-site-sync] Error in ${file}:`, err),
    ),
  ),
```
123-131: Minor inconsistency: `result.output.date` vs `data.date`.

Line 123 uses `result.output.date` whilst elsewhere the validated output is accessed via `data`. Both reference the same value, but using `data.date` would be more consistent.

💡 Suggested change

```diff
- const dateInMicroSeconds = new Date(result.output.date).getTime() * DATE_TO_MICROSECONDS
+ const dateInMicroSeconds = new Date(data.date).getTime() * DATE_TO_MICROSECONDS
```
```ts
const document = site.standard.document.$build({
  site: siteUrl as `${string}:${string}`,
  path: data.path,
  title: data.title,
  description: data.description ?? data.excerpt,
  tags: data.tags,
  // This can be extended to update the site.standard.document .updatedAt if it is changed and use the posts date here
  publishedAt: new Date(data.date).toISOString(),
})
```
🧩 Analysis chain

🏁 Scripts executed:

```shell
rg -n 'NPMX_SITE' --type ts -A 2 -B 2

cd modules && head -c 5000 standard-site-sync.ts | cat -n

rg -n 'siteUrl' modules/standard-site-sync.ts -B 3 -A 3

fd -t ts -e ts . -x grep -l 'NPMX_SITE' {}
```
Improve type safety for the `siteUrl` parameter.

The type assertion `` siteUrl as `${string}:${string}` `` assumes the URL contains a colon, but the parameter type `siteUrl: string` provides no enforcement. Whilst `NPMX_SITE` is hardcoded to `'https://npmx.dev'`, the parameter accepts any string, creating potential for runtime issues if the function is called with an improperly formatted URL.

Consider either constraining the parameter type using a template literal type (`` `${string}:${string}` ``) or adding runtime validation.
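A sketch of the runtime-validation route (the `toSiteUrl` helper is hypothetical, not code from the PR): narrow the plain string to the template literal type only after checking for a colon.

```typescript
type SiteUrl = `${string}:${string}`

// Validate at runtime, then narrow to the template literal type.
function toSiteUrl(value: string): SiteUrl {
  if (!value.includes(':')) {
    throw new Error(`Invalid site URL, expected a scheme separator: ${value}`)
  }
  return value as SiteUrl
}

const siteUrl = toSiteUrl('https://npmx.dev')
console.log(siteUrl)
```

With this in place, the `as` assertion lives in one audited spot instead of at every call site.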
@Kai-ros is this comment valid?
Seems like the humans are having a chat. I'll hop back into my burrow for now. If you need me again, just tag @coderabbitai in a new comment, and I'll come hopping out!
Actionable comments posted: 3
🧹 Nitpick comments (1)
server/api/atproto/author-profiles.get.ts (1)
44-46: Redundant array check.

The `!Array.isArray(authors)` check is unnecessary since `v.parse(v.array(AuthorSchema), parsed)` on line 30 already guarantees `authors` is an array — if it weren't, the valibot validation would have thrown.

♻️ Proposed simplification

```diff
- if (!Array.isArray(authors) || authors.length === 0) {
+ if (authors.length === 0) {
    return { authors: [] }
  }
```
fatfingers23 left a comment:
Looks good! I think the login part can come along later. There may also need to be a check before pushing, but I don't think that should hold up getting the PR in since this is the start of it.
```ts
// Clock id(3) needs to be the same everytime to get the same TID from a timestamp
const tid = TID.fromTime(dateInMicroSeconds, CLOCK_ID_THREE)

// client.put is async and needs to be awaited
```
Will need to make sure the client is logged in as well. Also want to only log in when we know we're going to do writes, so we don't hit session rate limits:
https://npmx.dev/package/@atproto/lex#user-content-authenticated-client-with-password
This pulls apart the feature work done on this PR npmx-dev#257. In this commit, we're adding the atproto server side blog apis that will fetch bluesky comments and author profiles. This will be used in a later PR that will implement the frontend blog functionality.

Co-authored-by: Brandon Hurrington <[email protected]>

For now this will be skipped since there's no md files and NPMX_PDS_URL isn't set.

Co-authored-by: Brandon Hurrington <[email protected]>
Actionable comments posted: 4
🧹 Nitpick comments (4)
server/api/atproto/author-profiles.get.ts (1)
44-46: Redundant `Array.isArray` check.

After `v.parse(v.array(AuthorSchema), parsed)` succeeds on line 30, `authors` is guaranteed to be an array. The `Array.isArray(authors)` check is unnecessary; only the length check is meaningful.

♻️ Proposed simplification

```diff
- if (!Array.isArray(authors) || authors.length === 0) {
+ if (authors.length === 0) {
    return { authors: [] }
  }
```

modules/standard-site-sync.ts (2)
13-14: Misleading constant name `DATE_TO_MICROSECONDS`.

The constant `DATE_TO_MICROSECONDS = 1000` is used to convert milliseconds to microseconds (line 123), but the name suggests it represents a duration rather than a conversion factor. Consider renaming for clarity.

♻️ Proposed rename

```diff
-const DATE_TO_MICROSECONDS = 1000
+const MILLISECONDS_TO_MICROSECONDS = 1000
```

And on line 123:

```diff
- const dateInMicroSeconds = new Date(result.output.date).getTime() * DATE_TO_MICROSECONDS
+ const dateInMicroseconds = new Date(result.output.date).getTime() * MILLISECONDS_TO_MICROSECONDS
```
78-80: Consider using async file reading.

`readFileSync` blocks the event loop. While acceptable for build-time operations, using `fs/promises.readFile` would be more idiomatic in an async context and could improve parallelism within batches.

♻️ Proposed change

```diff
-import { readFileSync } from 'node:fs'
+import { readFile } from 'node:fs/promises'

 // In syncFile function:
-  const fileContent = readFileSync(filePath, 'utf-8')
+  const fileContent = await readFile(filePath, 'utf-8')
```

server/api/atproto/bluesky-comments.get.ts (1)
109-114: Add explicit null checks for regex capture groups.

While the regex pattern guarantees capture groups exist when matched, TypeScript types them as `string | undefined`. Adding explicit checks improves type safety and guards against future regex changes.

🛡️ Proposed improvement

```diff
  function atUriToWebUrl(uri: string): string | null {
    const match = uri.match(AT_URI_REGEX)
    if (!match) return null
-   const [, did, rkey] = match
+   const did = match[1]
+   const rkey = match[2]
+   if (!did || !rkey) return null
    return `https://bsky.app/profile/${did}/post/${rkey}`
  }
```

As per coding guidelines: "Ensure you write strictly type-safe code, for example by ensuring you always check when accessing an array value by index".
```ts
syncFile(file, NPMX_SITE, client).catch(error =>
  console.error(`[standard-site-sync] Error in ${file}:` + error),
),
```
Error logging uses string concatenation with error object.
Using + error concatenates the error object as a string, which may lose stack trace information. Use a comma or structured logging for better error visibility.
🔧 Proposed fix
```diff
  syncFile(file, NPMX_SITE, client).catch(error =>
-   console.error(`[standard-site-sync] Error in ${file}:` + error),
+   console.error(`[standard-site-sync] Error in ${file}:`, error),
  ),
```
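The information loss is visible in a two-line sketch: concatenation coerces the Error with `String(error)`, which keeps only the name and message, while passing it as a separate argument lets `console.error` print the full object and stack.

```typescript
const error = new Error('boom')

// Concatenation: String(error) is just "Error: boom"; the stack trace is gone.
const concatenated = '[standard-site-sync] Error in post.md:' + error
console.log(concatenated) // "[standard-site-sync] Error in post.md:Error: boom"

// Separate argument: console.error receives the Error object itself and can
// render its stack trace.
console.error('[standard-site-sync] Error in post.md:', error)
```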
```ts
likes: likesResponse.likes || [],
totalLikes: postsResponse.posts?.[0]?.likeCount || thread?.likeCount || 0,
postUrl: atUriToWebUrl(uri),
```
Use nullish coalescing (??) for likeCount fallback.
Using || will treat likeCount: 0 as falsy and fall back to the next value. Since 0 is a valid like count, use ?? to only fall back on null/undefined.
🔧 Proposed fix

```diff
- totalLikes: postsResponse.posts?.[0]?.likeCount || thread?.likeCount || 0,
+ totalLikes: postsResponse.posts?.[0]?.likeCount ?? thread?.likeCount ?? 0,
```

As per coding guidelines: "Ensure you write strictly type-safe code".
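The difference only shows up when a post legitimately has zero likes, which is exactly the case worth getting right:

```typescript
const likeCount: number = 0 // a valid count returned by the API
const fallback = 42         // e.g. a stale thread likeCount

const withOr = likeCount || fallback      // 42: 0 is falsy, so it is silently replaced
const withNullish = likeCount ?? fallback // 0: only null/undefined fall through

console.log(withOr, withNullish) // 42 0
```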
```ts
getKey: event => {
  const { uri } = getQuery(event)
  return `bluesky:${uri}`
},
```
Cache key may resolve to "bluesky:undefined" for invalid requests.
If the uri query parameter is missing, getKey returns "bluesky:undefined", which could cause cache collisions for all invalid requests. Consider providing a fallback.
🔧 Proposed fix
```diff
  getKey: event => {
    const { uri } = getQuery(event)
-   return `bluesky:${uri}`
+   return `bluesky:${uri ?? ''}`
  },
```

```ts
// Regex
export const AT_URI_REGEX = /^at:\/\/(did:plc:[a-z0-9]+)\/app\.bsky\.feed\.post\/([a-z0-9]+)$/
```
🧩 Analysis chain
🌐 Web query:
AT Protocol TID rkey base32 character set valid characters
💡 Result:

TID (Timestamp Identifier) base32 character set (AT Protocol)

- Alphabet ("base32-sortable"): `234567abcdefghijklmnopqrstuvwxyz` (lowercase only) [1]
- Length: exactly 13 ASCII characters [1]
- First character restriction: must be one of `234567abcdefghij` [1]
- Reference regex: `^[234567abcdefghij][234567abcdefghijklmnopqrstuvwxyz]{12}$` [1]

rkey (Record Key) valid characters (overall constraint)

Regardless of rkey "type", an rkey must be 1–512 chars and use only this ASCII subset:

- Allowed: `A-Za-z0-9` plus `. - _ : ~` [2]
- Not allowed: the exact values `.` or `..` [2]

If your rkey is specifically the `tid` type, then it must follow the TID rules/alphabet above [2][1].

Sources: [1] AT Protocol TID spec, [2] AT Protocol Record Key spec.
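Those rules translate directly into a validator; this sketch just applies the reference regex quoted above:

```typescript
// Reference regex from the AT Protocol TID spec: 13 chars, base32-sortable
// alphabet, restricted first character.
const TID_REGEX = /^[234567abcdefghij][234567abcdefghijklmnopqrstuvwxyz]{12}$/

console.log(TID_REGEX.test('3jzfcijpj2z2a')) // true: 13 lowercase base32-sortable chars
console.log(TID_REGEX.test('3JZFCIJPJ2Z2A')) // false: uppercase is not allowed
console.log(TID_REGEX.test('3jzfcijpj2z'))   // false: only 11 characters
```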
🌐 Web query:
AT Protocol did:web support Bluesky identifiers
💡 Result:
Yes. AT Protocol supports did:web as a “blessed” DID method, alongside did:plc, so did:web identifiers are valid ATProto/Bluesky identifiers where DIDs are used (accounts and services). ATProto’s did:web support is restricted to hostname-level DIDs (no path-based did:web DIDs for accounts). [1], [2]
Bluesky itself uses did:web in production for some service DIDs (for example did:web:api.bsky.app#bsky_appview and did:web:api.bsky.chat#bsky_chat). [3]
Sources:
[1] AT Protocol DID spec (atproto.com/specs/did) (atproto.com)
[2] AT Protocol DIDs overview (bluesky-jp.github.io/specs/did) (bluesky-jp.github.io)
[3] Bluesky API directory / service DIDs (docs.bsky.app) (docs.bsky.app)
Citations:
- 1: https://atproto.com/specs/did?utm_source=openai
- 2: https://bluesky-jp.github.io/specs/did?utm_source=openai
- 3: https://docs.bsky.app/docs/advanced-guides/api-directory?utm_source=openai
AT_URI_REGEX may be overly restrictive.
The regex only matches `did:plc:` identifiers, but AT Protocol also supports `did:web:` DIDs; both are used in Bluesky (e.g. service DIDs like `did:web:api.bsky.app`). Additionally, the rkey pattern `[a-z0-9]+` does not accurately represent the TID specification, which requires the first character to be from `234567abcdefghij` and uses exactly 13 characters. More broadly, AT Protocol allows rkeys with the characters `A-Za-z0-9.-_:~`, so the pattern is unnecessarily restrictive.
This could cause valid AT URIs to fail validation, depending on your use case.
This is a simple use case, so it should be a drop-in replacement.
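As a sketch of what the broader pattern could look like (this exact regex is illustrative, not the one adopted in the PR): accept `did:plc` and host-level `did:web` authorities, and use the general record-key character set for the rkey.

```typescript
// did:plc:<id> or did:web:<hostname>, followed by the post collection and an
// rkey drawn from the record-key ASCII subset (A-Za-z0-9 . - _ : ~).
const AT_URI_REGEX =
  /^at:\/\/(did:(?:plc:[a-z0-9]+|web:[a-z0-9.-]+))\/app\.bsky\.feed\.post\/([A-Za-z0-9._:~-]{1,512})$/

const plc = 'at://did:plc:z72i7hdynmk6r22z27h6tvur/app.bsky.feed.post/3jzfcijpj2z2a'
const web = 'at://did:web:example.com/app.bsky.feed.post/3jzfcijpj2z2a'

console.log(plc.match(AT_URI_REGEX)?.[1]) // 'did:plc:z72i7hdynmk6r22z27h6tvur'
console.log(web.match(AT_URI_REGEX)?.[1]) // 'did:web:example.com'
```

A stricter variant could still use the TID reference regex for the rkey if only Bluesky post TIDs are expected.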
Actionable comments posted: 1
```ts
const dateInMicroSeconds = new Date(result.output.date).getTime() * DATE_TO_MICROSECONDS

// Clock id(3) needs to be the same everytime to get the same TID from a timestamp
const tid = TID.fromTime(dateInMicroSeconds, CLOCK_ID_THREE)

// client.put is async and needs to be awaited
await client.put(site.standard.document, document, {
  rkey: tid.str,
```
🧩 Analysis chain
🏁 Scripts executed:

```shell
# First, locate and inspect the file in question
git ls-files | grep -E "standard-site-sync\.ts"

# Get the full context around lines 123-130
if [ -f "modules/standard-site-sync.ts" ]; then
  wc -l modules/standard-site-sync.ts
  echo "=== Lines 110-145 ==="
  sed -n '110,145p' modules/standard-site-sync.ts
fi

# Search for TID implementation
rg -t ts "class TID|function TID|fromTime" --context 3

# Check for imports and CLOCK_ID_THREE definition
rg -t ts "CLOCK_ID_THREE|DATE_TO_MICROSECONDS" --context 2
```
🌐 Web query:
AT Protocol rkey uniqueness requirements TID collision same timestamp
💡 Result:
- Uniqueness scope for `rkey`: Record keys are only required to be unique within a single repo + collection. In other words, (did, collection, rkey) is the unique tuple; (did, rkey) is not unique, and the same `rkey` (including the same TID string) may be reused in different collections in the same repo. `rkey` values are ultimately user-controlled strings, so you must not assume any global uniqueness. [1]
- What a TID is (and why "same timestamp" can still differ): A TID is a 64-bit value where 53 bits are "microseconds since UNIX epoch" and the low 10 bits are a random "clock identifier". That means multiple TIDs can share the same microsecond timestamp while still being different due to the clock-id bits. [2]
- Collision expectations / requirements: atproto explicitly says global uniqueness of TIDs cannot be guaranteed (decentralized + potentially adversarial). However, TID generators should ensure their own output is monotonically increasing and never repeats, even when generating multiple IDs in the same microsecond or during clock adjustments ("clock smear" / sync incidents). A common strategy is `last_ts = max(now_us, last_ts + 1)`. [2] [3]
- If two records end up with the same TID `rkey`: within the same repo + collection that would be an `rkey` conflict because that tuple must be unique; the practical fix is to regenerate a new TID (or otherwise choose a different `rkey`) and retry. [1] [2]
Sources:
[1] AT Protocol — Record Keys (rkey), incl. (did, collection, rkey) uniqueness and non-global TID uniqueness notes: (atproto.com)
[2] AT Protocol — TID spec (53-bit microseconds + 10-bit clock id; monotonic/non-repeating guidance; no global guarantee): (atproto.com)
[3] Example implementation enforcing monotonic microsecond ticks (max(now_us, last+1)): (arroba.readthedocs.io)
Citations:
- 1: https://atproto.com/specs/record-key?utm_source=openai
- 2: https://atproto.com/specs/tid?utm_source=openai
- 3: https://arroba.readthedocs.io/en/v0.7/_modules/arroba/util.html?utm_source=openai
🌐 Web query:
AT Protocol lexicon rkey format constraints and TID fromTime stability
💡 Result:
Lexicon rkey / record-key format constraints

In Lexicon, record keys are strings with format `record-key`, and each record schema declares a key type (`tid`, `nsid`, `literal:<value>`, or `any`). [1]

Baseline record-key syntax (applies regardless of key type): [1]

- Length: 1–512 characters
- Allowed characters (ASCII subset): alphanumeric (`A-Z a-z 0-9`) plus `. - _ : ~`
- Forbidden exact values: `.` and `..`
- Case-sensitive

TID (Timestamp Identifier) format constraints

A TID is a 13-character string using the base32-sortable alphabet `234567abcdefghijklmnopqrstuvwxyz`. [2]

Structure / semantics: [2]

- Encodes a 64-bit integer (big-endian)
- Top bit is 0
- Next 53 bits: microseconds since UNIX epoch
- Last 10 bits: random "clock identifier"
- Designed to be lexicographically sortable

Syntax / parsing rules: [2]

- Length always 13 ASCII chars
- First char must be one of `234567abcdefghij`
- Reference regex: `^[234567abcdefghij][234567abcdefghijklmnopqrstuvwxyz]{12}$` [2]

TID.fromTime(...) stability (what you can and can't assume)

- The spec guarantees that a TID contains a timestamp (µs) plus a 10-bit clock id field. Because that clock id is not determined by time alone (it's intended to be random for generators), there is no spec-level guarantee that "time → TID" is unique or stable unless the clock id is fixed by the method. [2]
- Therefore, stability of `fromTime` is an implementation detail: it's typically used to create a sortable boundary (e.g. the lowest or highest possible TID for a given timestamp by choosing a fixed clock id like 0 or 1023), not a globally unique ID. This follows from the TID bit layout in the spec, but the exact choice (clock id, rounding to microseconds, etc.) depends on the library. [2]

Practical implication: if you need a deterministic cursor/boundary, `fromTime` can be stable only if the implementation sets the clock id deterministically and normalizes the timestamp deterministically; the TID spec itself doesn't require any particular `fromTime` behavior. [2]
Sources
[1] AT Protocol — Record Key spec: record-key types + baseline syntax constraints
[2] AT Protocol — TID spec: base32-sortable encoding, 53-bit timestamp + 10-bit clock id, regex/syntax rules
[3] AT Protocol — Lexicon spec: string formats include record-key and tid (context for schema validation) (atproto.com)
Genuine collision risk: fixed clock id causes same-day posts to overwrite each other.
Posts sharing the same date will generate identical rkeys because TID.fromTime(dateInMicroSeconds, CLOCK_ID_THREE) produces the same TID when the timestamp is identical and the clock id is constant. Within a single repo + collection, the tuple (did, collection, rkey) must be unique; two records with the same rkey will cause the second to overwrite the first.
Derive the clock id deterministically from data.path (which is unique per post) to ensure distinct TIDs:
Example fix
```diff
  const dateInMicroSeconds = new Date(result.output.date).getTime() * DATE_TO_MICROSECONDS
+ const clockId =
+   Number.parseInt(createHash('sha256').update(data.path).digest('hex').slice(0, 2), 16) % 1024
- const tid = TID.fromTime(dateInMicroSeconds, CLOCK_ID_THREE)
+ const tid = TID.fromTime(dateInMicroSeconds, clockId)
```
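The core of the fix, deriving the 10-bit clock id deterministically from the post's unique path, works standalone with Node's built-in crypto. A sketch (note that the first two hex characters only cover 256 of the 1024 possible clock ids, which still makes same-timestamp collisions unlikely):

```typescript
import { createHash } from 'node:crypto'

// Hash the unique post path and fold the first byte into the 10-bit
// clock-id space (0-1023). Same path in, same clock id out, so the
// resulting TID is stable across sync runs.
function clockIdFor(path: string): number {
  const firstByte = createHash('sha256').update(path).digest('hex').slice(0, 2)
  return Number.parseInt(firstByte, 16) % 1024
}

const a = clockIdFor('blog/first-post.md')
const b = clockIdFor('blog/second-post.md')

console.log(a === clockIdFor('blog/first-post.md'))   // true: deterministic per path
console.log(a >= 0 && a < 1024 && b >= 0 && b < 1024) // true: within the 10-bit range
```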
+ const tid = TID.fromTime(dateInMicroSeconds, clockId)📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| const dateInMicroSeconds = new Date(result.output.date).getTime() * DATE_TO_MICROSECONDS | |
| // Clock id(3) needs to be the same everytime to get the same TID from a timestamp | |
| const tid = TID.fromTime(dateInMicroSeconds, CLOCK_ID_THREE) | |
| // client.put is async and needs to be awaited | |
| await client.put(site.standard.document, document, { | |
| rkey: tid.str, | |
| const dateInMicroSeconds = new Date(result.output.date).getTime() * DATE_TO_MICROSECONDS | |
| const clockId = | |
| Number.parseInt(createHash('sha256').update(data.path).digest('hex').slice(0, 2), 16) % 1024 | |
| const tid = TID.fromTime(dateInMicroSeconds, clockId) | |
| // client.put is async and needs to be awaited | |
| await client.put(site.standard.document, document, { | |
| rkey: tid.str, |
```diff
@@ -1,4 +1,4 @@
-import * as dev from '#shared/types/lexicons/dev'
+import * as dev from '../types/lexicons/dev'
```
@fatfingers23 I had to update this because it was failing in CI for me https://github.com/npmx-dev/npmx.dev/actions/runs/21649231813/job/62409192743?pr=841
```text
Error: shared/utils/constants.ts(1,22): error TS2307: Cannot find module '#shared/types/lexicons/dev' or its corresponding type declarations.
```
Let me know if this change is okay!
LGTM! I think #shared is the correct way and I had it wrong
Weird, I wonder why #shared for me is failing 🤔