Skip to content

Conversation

@jonathanyeong
Copy link
Contributor

@jonathanyeong jonathanyeong commented Feb 3, 2026

This pulls apart the feature work done on this PR #257.

In this commit, we're adding the atproto server side blog apis that will fetch bluesky comments and author profiles. This
will be used in a later PR that will implement the frontend blog functionality.

@vercel
Copy link

vercel bot commented Feb 3, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

Project Deployment Actions Updated (UTC)
npmx.dev Ready Ready Preview, Comment Feb 3, 2026 9:57pm
2 Skipped Deployments
Project Deployment Actions Updated (UTC)
docs.npmx.dev Ignored Ignored Preview Feb 3, 2026 9:57pm
npmx-lunaria Ignored Ignored Feb 3, 2026 9:57pm

Request Review

@jonathanyeong jonathanyeong force-pushed the feat/atproto-blog-server-apis branch from 990ecfb to 9af4c33 Compare February 3, 2026 12:59
@jonathanyeong jonathanyeong marked this pull request as ready for review February 3, 2026 12:59
@codecov
Copy link

codecov bot commented Feb 3, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.

📢 Thoughts on this report? Let us know!

@coderabbitai
Copy link
Contributor

coderabbitai bot commented Feb 3, 2026

📝 Walkthrough

Walkthrough

Adds Bluesky / AT-protocol integration and blog tooling. Introduces two cached server API endpoints: author-profiles (resolves author avatars and profile URLs) and bluesky-comments (fetches and normalises Bluesky post threads, likes and embeds). Adds Valibot schemas (BlueSkyUriSchema, AuthorSchema, BlogPostSchema), TypeScript types for comments and embeds, shared constants including AT_URI_REGEX, BLUESKY_API and BLUESKY_COMMENTS_REQUEST, a basic frontmatter parser, a Nuxt module standard-site-sync to sync blog posts to an AT-protocol site, and package.json dependency and script order updates.

Possibly related PRs

  • feat: package likes #712: Modifies Atproto integration and updates shared/utils/constants.ts, adding complementary AT-protocol-related constants.

Suggested reviewers

  • danielroe
🚥 Pre-merge checks | ✅ 1
✅ Passed checks (1 passed)
Check name Status Explanation
Description check ✅ Passed The pull request description accurately describes the changeset as adding server-side AT Protocol blog APIs for fetching Bluesky comments and author profiles.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

🧹 Nitpick comments (5)
server/api/atproto/author-profiles.get.ts (2)

44-46: Redundant array check after Valibot parse.

After v.parse(v.array(AuthorSchema), parsed) succeeds, authors is guaranteed to be an array. The Array.isArray(authors) check is unnecessary.

♻️ Suggested simplification
-    if (!Array.isArray(authors) || authors.length === 0) {
+    if (authors.length === 0) {
       return { authors: [] }
     }

97-100: Cache key may vary with JSON formatting differences.

Using the raw authors query string as the cache key means semantically identical requests with different JSON formatting (e.g., spacing, key order) will create separate cache entries. Consider normalising the key.

♻️ Suggested improvement
    getKey: event => {
      const { authors } = getQuery(event)
-     return `author-profiles:${authors}`
+     // Normalise by parsing and re-stringifying with sorted keys
+     try {
+       const parsed = JSON.parse(authors as string)
+       return `author-profiles:${JSON.stringify(parsed)}`
+     } catch {
+       return `author-profiles:${authors}`
+     }
    },
server/api/atproto/bluesky-comments.get.ts (3)

80-85: Array access without bounds check.

Line 83 accesses postsResponse.posts?.[0] which could be undefined if the array is empty. While optional chaining handles this, the coding guidelines recommend explicit checks when accessing array elements by index for clarity.

♻️ Suggested explicit check
+     const postMetadata = postsResponse.posts.length > 0 ? postsResponse.posts[0] : null
+
      return {
        thread,
        likes: likesResponse.likes || [],
-       totalLikes: postsResponse.posts?.[0]?.likeCount || thread?.likeCount || 0,
+       totalLikes: postMetadata?.likeCount ?? thread?.likeCount ?? 0,
        postUrl: atUriToWebUrl(uri),
      }

109-114: Destructuring match result without null check on capture groups.

While the regex guarantees capture groups exist when a match occurs, TypeScript cannot infer this. The destructured did and rkey could be typed as string | undefined. Consider adding explicit checks or using non-null assertion with a comment explaining the guarantee.

♻️ Safer destructuring
 function atUriToWebUrl(uri: string): string | null {
   const match = uri.match(AT_URI_REGEX)
   if (!match) return null
-  const [, did, rkey] = match
+  const did = match[1]
+  const rkey = match[2]
+  if (!did || !rkey) return null // Defensive check for type safety
   return `https://bsky.app/profile/${did}/post/${rkey}`
 }

101-104: Cache key computed before URI validation.

The getKey function extracts the URI directly from query params without validation. If an invalid URI is provided, it will still be used as the cache key before the handler throws a 400 error, potentially polluting the cache namespace with invalid keys.

♻️ Suggested improvement
    getKey: event => {
      const { uri } = getQuery(event)
-     return `bluesky:${uri}`
+     // Only cache valid URIs - invalid ones won't be stored anyway due to error response
+     return typeof uri === 'string' ? `bluesky:${uri}` : 'bluesky:invalid'
    },

Alternatively, consider validating in getKey to return a consistent key for invalid inputs, preventing cache key pollution.

Comment on lines +23 to +28
else if (value.startsWith('[') && value.endsWith(']')) {
acc[key] = value
.slice(1, -1)
.split(',')
.map(s => s.trim().replace(/^["']|["']$/g, ''))
} else acc[key] = value
Copy link
Contributor

@coderabbitai coderabbitai bot Feb 3, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Empty array produces a single-element array with empty string.

When parsing [], value.slice(1, -1) yields '', and ''.split(',') produces [''] rather than []. This may cause downstream issues when consumers expect an empty array.

🛠️ Proposed fix
      else if (value.startsWith('[') && value.endsWith(']')) {
-       acc[key] = value
-         .slice(1, -1)
-         .split(',')
-         .map(s => s.trim().replace(/^["']|["']$/g, ''))
+       const inner = value.slice(1, -1).trim()
+       acc[key] = inner
+         ? inner.split(',').map(s => s.trim().replace(/^["']|["']$/g, ''))
+         : []
      } else acc[key] = value

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Kai-ros is this comment valid?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems like the humans are having a chat. I'll hop back into my burrow for now. If you need me again, just tag @coderabbitai in a new comment, and I'll come hopping out!

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🧹 Nitpick comments (4)
package.json (1)

42-42: Inconsistent version pinning for @atproto/common.

This dependency is pinned to an exact version (0.5.10) whilst other @atproto/* packages (lines 41, 44) use caret ranges (^). This could lead to version mismatches between related packages. Consider using ^0.5.10 for consistency, or document why exact pinning is required here.

💡 Suggested change
-    "@atproto/common": "0.5.10",
+    "@atproto/common": "^0.5.10",
modules/standard-site-sync.ts (3)

14-14: Consider renaming for clarity.

The constant converts milliseconds (from Date.getTime()) to microseconds. MILLISECONDS_TO_MICROSECONDS would more accurately describe its purpose.


46-48: Inconsistent error logging format.

Line 47 uses string concatenation whilst line 65 uses a comma-separated approach. For consistency and better stack trace formatting, consider using the comma-separated style throughout.

💡 Suggested change
           batch.map(file =>
-            syncFile(file, NPMX_SITE, client).catch(error =>
-              console.error(`[standard-site-sync] Error in ${file}:` + error),
+            syncFile(file, NPMX_SITE, client).catch(err =>
+              console.error(`[standard-site-sync] Error in ${file}:`, err),
             ),
           ),

123-131: Minor inconsistency: result.output.date vs data.date.

Line 123 uses result.output.date whilst elsewhere the validated output is accessed via data. Both reference the same value, but using data.date would be more consistent.

💡 Suggested change
-  const dateInMicroSeconds = new Date(result.output.date).getTime() * DATE_TO_MICROSECONDS
+  const dateInMicroSeconds = new Date(data.date).getTime() * DATE_TO_MICROSECONDS

Comment on lines +113 to +121
const document = site.standard.document.$build({
site: siteUrl as `${string}:${string}`,
path: data.path,
title: data.title,
description: data.description ?? data.excerpt,
tags: data.tags,
// This can be extended to update the site.standard.document .updatedAt if it is changed and use the posts date here
publishedAt: new Date(data.date).toISOString(),
})
Copy link
Contributor

@coderabbitai coderabbitai bot Feb 3, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

rg -n 'NPMX_SITE' --type ts -A 2 -B 2

Repository: npmx-dev/npmx.dev

Length of output: 1595


🏁 Script executed:

cd modules && head -c 5000 standard-site-sync.ts | cat -n

Repository: npmx-dev/npmx.dev

Length of output: 5648


🏁 Script executed:

rg -n 'siteUrl' modules/standard-site-sync.ts -B 3 -A 3

Repository: npmx-dev/npmx.dev

Length of output: 599


🏁 Script executed:

fd -t ts -e ts . -x grep -l 'NPMX_SITE' {}

Repository: npmx-dev/npmx.dev

Length of output: 230


Improve type safety for the siteUrl parameter.

The type assertion siteUrl as \${string}:${string}`assumes the URL contains a colon, but the parameter typesiteUrl: stringprovides no enforcement. Whilst NPMX_SITE is hardcoded to'https://npmx.dev'`, the parameter accepts any string, creating potential for runtime issues if the function is called with an improperly formatted URL.

Consider either constraining the parameter type using a template literal type (${string}:${string}) or adding runtime validation.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Kai-ros is this comment valid?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems like the humans are having a chat. I'll hop back into my burrow for now. If you need me again, just tag @coderabbitai in a new comment, and I'll come hopping out!

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

🧹 Nitpick comments (1)
server/api/atproto/author-profiles.get.ts (1)

44-46: Redundant array check.

The !Array.isArray(authors) check is unnecessary since v.parse(v.array(AuthorSchema), parsed) on line 30 already guarantees authors is an array — if it weren't, the valibot validation would have thrown.

♻️ Proposed simplification
-    if (!Array.isArray(authors) || authors.length === 0) {
+    if (authors.length === 0) {
       return { authors: [] }
     }

Copy link
Contributor

@fatfingers23 fatfingers23 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks good! I think the login part can come along later. I think there may need to be a check as well before pushing. But I don't think it should accept getting the pr in since this is the start of it

// Clock id(3) needs to be the same everytime to get the same TID from a timestamp
const tid = TID.fromTime(dateInMicroSeconds, CLOCK_ID_THREE)

// client.put is async and needs to be awaited
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will need to make sure the client is logged into as well. Also want to only login when we know we're going do writes so we don't hit session rate limits

https://npmx.dev/package/@atproto/lex#user-content-authenticated-client-with-password

jonathanyeong and others added 6 commits February 3, 2026 16:08
This pulls apart the feature work done on this PR npmx-dev#257.

In this commit, we're adding the atproto server side blog apis that will fetch bluesky comments
and author profiles. This will be used in a later PR that will implement the frontend blog functionality.

Co-authored-by: Brandon Hurrington <[email protected]>
For now this will be skipped since there's no md files and NPMX_PDS_URL isn't set.

Co-authored-by: Brandon Hurrington <[email protected]>
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 4

🧹 Nitpick comments (4)
server/api/atproto/author-profiles.get.ts (1)

44-46: Redundant Array.isArray check.

After v.parse(v.array(AuthorSchema), parsed) succeeds on line 30, authors is guaranteed to be an array. The Array.isArray(authors) check is unnecessary; only the length check is meaningful.

♻️ Proposed simplification
-    if (!Array.isArray(authors) || authors.length === 0) {
+    if (authors.length === 0) {
       return { authors: [] }
     }
modules/standard-site-sync.ts (2)

13-14: Misleading constant name DATE_TO_MICROSECONDS.

The constant DATE_TO_MICROSECONDS = 1000 is used to convert milliseconds to microseconds (line 123), but the name suggests it represents a duration rather than a conversion factor. Consider renaming for clarity.

♻️ Proposed rename
-const DATE_TO_MICROSECONDS = 1000
+const MILLISECONDS_TO_MICROSECONDS = 1000

And on line 123:

-  const dateInMicroSeconds = new Date(result.output.date).getTime() * DATE_TO_MICROSECONDS
+  const dateInMicroseconds = new Date(result.output.date).getTime() * MILLISECONDS_TO_MICROSECONDS

78-80: Consider using async file reading.

readFileSync blocks the event loop. While acceptable for build-time operations, using fs/promises.readFile would be more idiomatic in an async context and could improve parallelism within batches.

♻️ Proposed change
-import { readFileSync } from 'node:fs'
+import { readFile } from 'node:fs/promises'

// In syncFile function:
-  const fileContent = readFileSync(filePath, 'utf-8')
+  const fileContent = await readFile(filePath, 'utf-8')
server/api/atproto/bluesky-comments.get.ts (1)

109-114: Add explicit null checks for regex capture groups.

While the regex pattern guarantees capture groups exist when matched, TypeScript types them as string | undefined. Adding explicit checks improves type safety and guards against future regex changes.

🛡️ Proposed improvement
 function atUriToWebUrl(uri: string): string | null {
   const match = uri.match(AT_URI_REGEX)
   if (!match) return null
-  const [, did, rkey] = match
+  const did = match[1]
+  const rkey = match[2]
+  if (!did || !rkey) return null
   return `https://bsky.app/profile/${did}/post/${rkey}`
 }

As per coding guidelines: "Ensure you write strictly type-safe code, for example by ensuring you always check when accessing an array value by index".

Comment on lines +46 to +48
syncFile(file, NPMX_SITE, client).catch(error =>
console.error(`[standard-site-sync] Error in ${file}:` + error),
),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Error logging uses string concatenation with error object.

Using + error concatenates the error object as a string, which may lose stack trace information. Use a comma or structured logging for better error visibility.

🔧 Proposed fix
           syncFile(file, NPMX_SITE, client).catch(error =>
-              console.error(`[standard-site-sync] Error in ${file}:` + error),
+              console.error(`[standard-site-sync] Error in ${file}:`, error),
           ),
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
syncFile(file, NPMX_SITE, client).catch(error =>
console.error(`[standard-site-sync] Error in ${file}:` + error),
),
syncFile(file, NPMX_SITE, client).catch(error =>
console.error(`[standard-site-sync] Error in ${file}:`, error),
),

Comment on lines 82 to 84
likes: likesResponse.likes || [],
totalLikes: postsResponse.posts?.[0]?.likeCount || thread?.likeCount || 0,
postUrl: atUriToWebUrl(uri),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Use nullish coalescing (??) for likeCount fallback.

Using || will treat likeCount: 0 as falsy and fall back to the next value. Since 0 is a valid like count, use ?? to only fall back on null/undefined.

🔧 Proposed fix
-        totalLikes: postsResponse.posts?.[0]?.likeCount || thread?.likeCount || 0,
+        totalLikes: postsResponse.posts?.[0]?.likeCount ?? thread?.likeCount ?? 0,

As per coding guidelines: "Ensure you write strictly type-safe code".

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
likes: likesResponse.likes || [],
totalLikes: postsResponse.posts?.[0]?.likeCount || thread?.likeCount || 0,
postUrl: atUriToWebUrl(uri),
likes: likesResponse.likes || [],
totalLikes: postsResponse.posts?.[0]?.likeCount ?? thread?.likeCount ?? 0,
postUrl: atUriToWebUrl(uri),

Comment on lines +101 to +104
getKey: event => {
const { uri } = getQuery(event)
return `bluesky:${uri}`
},
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Cache key may resolve to "bluesky:undefined" for invalid requests.

If the uri query parameter is missing, getKey returns "bluesky:undefined", which could cause cache collisions for all invalid requests. Consider providing a fallback.

🔧 Proposed fix
     getKey: event => {
       const { uri } = getQuery(event)
-      return `bluesky:${uri}`
+      return `bluesky:${uri ?? ''}`
     },

Comment on lines +64 to +65
// Regex
export const AT_URI_REGEX = /^at:\/\/(did:plc:[a-z0-9]+)\/app\.bsky\.feed\.post\/([a-z0-9]+)$/
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🌐 Web query:

AT Protocol TID rkey base32 character set valid characters

💡 Result:

TID (Timestamp Identifier) base32 character set (AT Protocol)

  • Alphabet (“base32-sortable”): 234567abcdefghijklmnopqrstuvwxyz (lowercase only) [1]
  • Length: exactly 13 ASCII characters [1]
  • First character restriction: must be one of 234567abcdefghij [1]
  • Reference regex: ^[234567abcdefghij][234567abcdefghijklmnopqrstuvwxyz]{12}$ [1]

rkey (Record Key) valid characters (overall constraint)

Regardless of rkey “type”, an rkey must be 1–512 chars and use only this ASCII subset:

  • Allowed: A-Za-z0-9 plus . - _ : ~ (i.e., .-_:~) [2]
  • Not allowed: . or .. [2]

If your rkey is specifically the tid type, then it must follow the TID rules/alphabet above [2][1].

Sources: [1] AT Protocol TID spec, [2] AT Protocol Record Key spec.


🌐 Web query:

AT Protocol did:web support Bluesky identifiers

💡 Result:

Yes. AT Protocol supports did:web as a “blessed” DID method, alongside did:plc, so did:web identifiers are valid ATProto/Bluesky identifiers where DIDs are used (accounts and services). ATProto’s did:web support is restricted to hostname-level DIDs (no path-based did:web DIDs for accounts). [1], [2]

Bluesky itself uses did:web in production for some service DIDs (for example did:web:api.bsky.app#bsky_appview and did:web:api.bsky.chat#bsky_chat). [3]

Sources:
[1] AT Protocol DID spec (atproto.com/specs/did) (atproto.com)
[2] AT Protocol DIDs overview (bluesky-jp.github.io/specs/did) (bluesky-jp.github.io)
[3] Bluesky API directory / service DIDs (docs.bsky.app) (docs.bsky.app)

Citations:


AT_URI_REGEX may be overly restrictive.

The regex only matches did:plc: identifiers, but AT Protocol also supports did:web: DIDs—both are used in Bluesky (e.g., service DIDs like did:web:api.bsky.app). Additionally, the rkey pattern [a-z0-9]+ does not accurately represent the TID specification, which requires the first character to be from 234567abcdefghij and uses exactly 13 characters. More broadly, AT Protocol allows rkeys with the characters A-Za-z0-9.-_:~, so the pattern is unnecessarily restrictive.

This could cause valid AT URIs to fail validation, depending on your use case.

This is a simple use case so it should be a drop in replacement.
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

Comment on lines +123 to +130
const dateInMicroSeconds = new Date(result.output.date).getTime() * DATE_TO_MICROSECONDS

// Clock id(3) needs to be the same everytime to get the same TID from a timestamp
const tid = TID.fromTime(dateInMicroSeconds, CLOCK_ID_THREE)

// client.put is async and needs to be awaited
await client.put(site.standard.document, document, {
rkey: tid.str,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

# First, locate and inspect the file in question
git ls-files | grep -E "standard-site-sync\.ts"

Repository: npmx-dev/npmx.dev

Length of output: 90


🏁 Script executed:

# Get the full context around lines 123-130
if [ -f "modules/standard-site-sync.ts" ]; then
  wc -l modules/standard-site-sync.ts
  echo "=== Lines 110-145 ===" 
  sed -n '110,145p' modules/standard-site-sync.ts
fi

Repository: npmx-dev/npmx.dev

Length of output: 933


🏁 Script executed:

# Search for TID implementation
rg -t ts "class TID|function TID|fromTime" --context 3

Repository: npmx-dev/npmx.dev

Length of output: 615


🏁 Script executed:

# Check for imports and CLOCK_ID_THREE definition
rg -t ts "CLOCK_ID_THREE|DATE_TO_MICROSECONDS" --context 2

Repository: npmx-dev/npmx.dev

Length of output: 963


🌐 Web query:

AT Protocol rkey uniqueness requirements TID collision same timestamp

💡 Result:

  • Uniqueness scope for rkey: Record keys are only required to be unique within a single repo + collection. In other words, (did, collection, rkey) is the unique tuple; (did, rkey) is not unique, and the same rkey (including the same TID string) may be reused in different collections in the same repo. rkey values are ultimately user-controlled strings, so you must not assume any global uniqueness. [1]
  • What a TID is (and why “same timestamp” can still differ): A TID is a 64-bit value where 53 bits are “microseconds since UNIX epoch” and the low 10 bits are a random “clock identifier”. That means multiple TIDs can share the same microsecond timestamp while still being different due to the clock-id bits. [2]
  • Collision expectations / requirements: atproto explicitly says global uniqueness of TIDs cannot be guaranteed (decentralized + potentially adversarial). However, TID generators should ensure their own output is monotonically increasing and never repeats, even when generating multiple IDs in the same microsecond or during clock adjustments (“clock smear” / sync incidents). A common strategy is “last_ts = max(now_us, last_ts+1)”. [2] [3]
  • If two records end up with the same TID rkey: within the same repo+collection that would be an rkey conflict because that tuple must be unique; the practical fix is to regenerate a new TID (or otherwise choose a different rkey) and retry. [1] [2]

Sources:
[1] AT Protocol — Record Keys (rkey), incl. (did, collection, rkey) uniqueness and non-global TID uniqueness notes: (atproto.com)
[2] AT Protocol — TID spec (53-bit microseconds + 10-bit clock id; monotonic/non-repeating guidance; no global guarantee): (atproto.com)
[3] Example implementation enforcing monotonic microsecond ticks (max(now_us, last+1)): (arroba.readthedocs.io)

Citations:


🌐 Web query:

AT Protocol lexicon rkey format constraints and TID fromTime stability

💡 Result:

Lexicon rkey / record-key format constraints

In Lexicon, record keys are strings with format record-key, and each record schema declares a key type (tid, nsid, literal:<value>, or any). [1]

Baseline record-key syntax (applies regardless of key type): [1]

  • Length: 1–512 characters
  • Allowed characters (ASCII subset): alphanumeric (A–Z a–z 0–9) plus . - _ : ~
  • Forbidden exact values: . and ..
  • Case-sensitive

TID (Timestamp Identifier) format constraints

A TID is a 13-character string using the base32-sortable alphabet 234567abcdefghijklmnopqrstuvwxyz. [2]

Structure / semantics: [2]

  • Encodes a 64-bit integer (big-endian)
  • Top bit is 0
  • Next 53 bits: microseconds since UNIX epoch
  • Last 10 bits: random “clock identifier”
  • Designed to be lexicographically sortable

Syntax / parsing rules: [2]

  • Length always 13 ASCII chars
  • First char must be one of 234567abcdefghij
  • Reference regex: ^[234567abcdefghij][234567abcdefghijklmnopqrstuvwxyz]{12}$ [2]

TID.fromTime(...) stability (what you can and can’t assume)

  • The spec guarantees that a TID contains a timestamp (µs) plus a 10-bit clock id field. Because that clock id is not determined by time alone (it’s intended to be random for generators), there is no spec-level guarantee that “time → TID” is unique or stable unless the clock id is fixed by the method. [2]
  • Therefore, stability of fromTime is an implementation detail: it’s typically used to create a sortable boundary (e.g., the lowest or highest possible TID for a given timestamp by choosing a fixed clock id like 0 or 1023), not a globally unique ID. This follows from the TID bit layout in the spec, but the exact choice (clock id, rounding to microseconds, etc.) depends on the library. [2]

Practical implication: if you need a deterministic cursor/boundary, fromTime can be stable only if the implementation sets the clock id deterministically and normalizes the timestamp deterministically; the TID spec itself doesn’t require any particular fromTime behavior. [2]

Sources

[1] AT Protocol — Record Key spec: record-key types + baseline syntax constraints
[2] AT Protocol — TID spec: base32-sortable encoding, 53-bit timestamp + 10-bit clock id, regex/syntax rules
[3] AT Protocol — Lexicon spec: string formats include record-key and tid (context for schema validation) (atproto.com)

Citations:


Genuine collision risk: fixed clock id causes same-day posts to overwrite each other.

Posts sharing the same date will generate identical rkeys because TID.fromTime(dateInMicroSeconds, CLOCK_ID_THREE) produces the same TID when the timestamp is identical and the clock id is constant. Within a single repo + collection, the tuple (did, collection, rkey) must be unique; two records with the same rkey will cause the second to overwrite the first.

Derive the clock id deterministically from data.path (which is unique per post) to ensure distinct TIDs:

Example fix
  const dateInMicroSeconds = new Date(result.output.date).getTime() * DATE_TO_MICROSECONDS

+  const clockId =
+    Number.parseInt(createHash('sha256').update(data.path).digest('hex').slice(0, 2), 16) % 1024
-  const tid = TID.fromTime(dateInMicroSeconds, CLOCK_ID_THREE)
+  const tid = TID.fromTime(dateInMicroSeconds, clockId)
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
const dateInMicroSeconds = new Date(result.output.date).getTime() * DATE_TO_MICROSECONDS
// Clock id(3) needs to be the same everytime to get the same TID from a timestamp
const tid = TID.fromTime(dateInMicroSeconds, CLOCK_ID_THREE)
// client.put is async and needs to be awaited
await client.put(site.standard.document, document, {
rkey: tid.str,
const dateInMicroSeconds = new Date(result.output.date).getTime() * DATE_TO_MICROSECONDS
const clockId =
Number.parseInt(createHash('sha256').update(data.path).digest('hex').slice(0, 2), 16) % 1024
const tid = TID.fromTime(dateInMicroSeconds, clockId)
// client.put is async and needs to be awaited
await client.put(site.standard.document, document, {
rkey: tid.str,

@@ -1,4 +1,4 @@
import * as dev from '#shared/types/lexicons/dev'
import * as dev from '../types/lexicons/dev'
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@fatfingers23 I had to update this because it was failing in CI for me https://github.com/npmx-dev/npmx.dev/actions/runs/21649231813/job/62409192743?pr=841

Error: shared/utils/constants.ts(1,22): error TS2307: Cannot find module '#shared/types/lexicons/dev' or its corresponding type declarations.

Let me know if this change is okay!

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! I think #shared is the correct way and I had it wrong

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Weird, I wonder why #shared for me is failing 🤔

@danielroe danielroe added this pull request to the merge queue Feb 3, 2026
Merged via the queue into npmx-dev:main with commit 180317b Feb 3, 2026
17 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants