Skip to content

[WIP] feature: shell integration 💻#508

Draft
tnaum-ms wants to merge 100 commits intonextfrom
feature/shell-integration
Draft

[WIP] feature: shell integration 💻#508
tnaum-ms wants to merge 100 commits intonextfrom
feature/shell-integration

Conversation

@tnaum-ms
Copy link
Collaborator

@tnaum-ms tnaum-ms commented Feb 17, 2026

Shell Integration — DocumentDB Query Language & Autocomplete

Umbrella PR for the shell integration feature: a custom documentdb-query Monaco language with intelligent autocomplete, hover docs, and validation across all query editor surfaces (filter, project, sort, aggregation, shell).

Work is organized as incremental steps, each delivered via a dedicated sub-PR merged into feature/shell-integration.


Progress

  • Step 1 — Schema Tool Decision — Evaluated schema analysis approaches, decided to enhance SchemaAnalyzer (JSON Schema output, incremental merge, 24 BSON types)
  • Step 2 — SchemaAnalyzer Refactoring · refactor: SchemaAnalyzer class + enhanced FieldEntry + new schema transformers #506 — Extracted @vscode-documentdb/schema-analyzer package, enriched FieldEntry with BSON types, added schema transformers, introduced monorepo structure
  • Step 3 — documentdb-constants Package · feat: add documentdb-constants package — operator metadata for autocomplete #513 — 308 operator entries (DocumentDB API query operators, update operators, stages, accumulators, BSON constructors, system variables) as static metadata for autocomplete
  • Step 3.5 — Monaco Language Architecture — Selected documentdb-query custom language with JS Monarch tokenizer (no TS worker), validated via POC across 8 test criteria
  • Step 4 — Filter CompletionItemProvider · feat: documentdb-query language — CompletionItemProvider, HoverProvider, acorn validation #518documentdb-query language registration, per-editor model URIs, completion data store, CompletionItemProvider (filter/project/sort), HoverProvider, acorn validation, $-prefix fix, query parser replacement (shell-bson-parser), type-aware operator sorting, legacy JSON Schema pipeline removal
  • Step 4.5 — Context-Sensitive Completions · feat: context-sensitive completions — cursor-aware filtering & type suggestions #530 ✅ — Cursor-position-aware filtering (key/value/operator/array-element), type-aware value suggestions (bool → true/false, number → range query, etc.), snippet escape fix, completion item styling, refactored into completions/ folder
  • Step 4.6 — Collection View & Autocompletion UX Improvements · feat: collection view & autocompletion UX improvements (Step 4.6) #532 — Clickable doc links, field completion persistence, $not fix, project/sort value completions, auto-trigger characters, smart-trigger, field hover provider, quoted key hover, hover link handler, category coverage tests
  • Step 5 — Legacy Scrapbook Removal
  • Step 6 — Scrapbook Rebuild (New Shell)
  • Step 7 — Shell CompletionItemProvider
  • Step 8 — Aggregation CompletionItemProvider

Key Architecture Decisions

Decision Outcome
Language strategy documentdb-query custom language — JS Monarch tokenizer, no TS worker (~400-600 KB saved)
Completion providers Single CompletionItemProvider + URI routing (documentdb://{editorType}/{sessionId})
Completion data documentdb-constants bundled at build time; field data pushed via tRPC subscription
Validation acorn.parseExpressionAt() for syntax errors; acorn-walk + documentdb-constants for identifier validation
Document editors Stay on language="json" with JSON Schema validation
Shell/Scrapbook (future) language="javascript" with full TS service + .d.ts via addExtraLib()

Sub-PRs

PR Step Title Status
#506 2 refactor: SchemaAnalyzer class + enhanced FieldEntry + new schema transformers ✅ Merged
#513 3 feat: add documentdb-constants package — operator metadata for autocomplete ✅ Merged
#518 4 feat: documentdb-query language — CompletionItemProvider, HoverProvider, acorn validation ✅ Merged
#530 4.5 feat: context-sensitive completions — cursor-aware filtering & type suggestions ✅ Merged
#532 4.6 feat: collection view & autocompletion UX improvements 🔨 Open

tnaum-ms and others added 27 commits February 16, 2026 20:16
… stats bugs

Group A of SchemaAnalyzer refactor:
- Fix A1: array element stats overwrite bug (isNewTypeEntry)
- Fix A2: probability >100% for array-embedded objects (x-documentsInspected)
- Rename folder: src/utils/json/mongo/ → src/utils/json/data-api/
- Rename enum: MongoBSONTypes → BSONTypes
- Rename file: MongoValueFormatters → ValueFormatters
- Add 9 new tests for array stats and probability
Group B of SchemaAnalyzer refactor:
- B1: SchemaAnalyzer class with addDocument(), getSchema(), reset(), getDocumentCount()
- B2: clone() method using structuredClone for schema branching
- B3: addDocuments() batch convenience method
- B4: static fromDocument()/fromDocuments() factories (replaces getSchemaFromDocument)
- B5: Migrate ClusterSession to use SchemaAnalyzer instance
- B6-B7: Remove old free functions (updateSchemaWithDocument, getSchemaFromDocument)
- Keep getPropertyNamesAtLevel, getSchemaAtPath, buildFullPaths as standalone exports
…x properties type

Group C of SchemaAnalyzer refactor:
- C1: Add typed x-minValue, x-maxValue, x-minLength, x-maxLength, x-minDate,
  x-maxDate, x-trueCount, x-falseCount, x-minItems, x-maxItems,
  x-minProperties, x-maxProperties to JSONSchema interface
- C2: Fix properties type: properties?: JSONSchema → properties?: JSONSchemaMap
- C3: Fix downstream type errors in SchemaAnalyzer.test.ts (JSONSchemaRef casts)
…temBsonType

Group D of SchemaAnalyzer refactor:
- D1: Add bsonType to FieldEntry (dominant BSON type from x-bsonType)
- D2: Add bsonTypes[] for polymorphic fields (2+ distinct types)
- D3: Add isOptional flag (x-occurrence < parent x-documentsInspected)
- D4: Add arrayItemBsonType for array fields (dominant element BSON type)
- D5: Sort results: _id first, then alphabetical by path
- D6: Verified generateMongoFindJsonSchema still works (additive changes)
- G4: Add 7 getKnownFields tests covering all new fields
… toFieldCompletionItems)

Group E of SchemaAnalyzer refactor:
- E1: generateDescriptions() — post-processor adding human-readable description
  strings with type info, occurrence percentage, and min/max stats
- E2: toTypeScriptDefinition() — generates TypeScript interface strings from
  JSONSchema for shell addExtraLib() integration
- E3: toFieldCompletionItems() — converts FieldEntry[] to CompletionItemProvider-
  ready FieldCompletionData[] with insert text escaping and $ references

Also:
- Rename isOptional → isSparse in FieldEntry and FieldCompletionData
  (all fields are implicitly optional in MongoDB API / DocumentDB API;
  isSparse is a statistical observation, not a constraint)
- Fix lint errors (inline type specifiers)
- 18 new tests for transformers + updated existing tests
- Add 5 tests for clone(), reset(), fromDocument(), fromDocuments(), addDocuments()
- Mark all checklist items A-G as complete, F1-F2 as deferred
- Add Manual Test Plan section (§14) with 5 end-to-end test scenarios
- Document clone() limitation with BSON Binary types (structuredClone)
- Add monotonic version counter to SchemaAnalyzer (incremented on mutations)
- Cache getKnownFields() with version-based staleness check
- Add ClusterSession.getKnownFields() accessor (delegates to cached analyzer)
- Wire collectionViewRouter to use session.getKnownFields() instead of standalone function
- Add ext.outputChannel.trace for schema accumulation and reset events
Co-authored-by: tnaum-ms <171359267+tnaum-ms@users.noreply.github.com>
Move SchemaAnalyzer, JSONSchema types, BSONTypes, ValueFormatters, and
getKnownFields into packages/schema-analyzer as @vscode-documentdb/schema-analyzer.

- Set up npm workspaces (packages/*) and TS project references
- Update all extension-side imports to use the new package
- Configure Jest multi-project for both extension and package tests
- Remove @vscode/l10n dependency from core (replaced with plain Error)
- Fix strict-mode type issues (localeCompare bug, index signatures)
- Update .gitignore to include root packages/ directory
- Add packages/ to prettier glob
…itions

The bsonToTypeScriptMap emits non-built-in type names (ObjectId, Binary,
Timestamp, etc.) without corresponding import statements or declare stubs.
Currently harmless since the output is for display/hover only, but should
be addressed if the TS definition is ever consumed by a real TS language
service.

Addresses PR #506 review comment from copilot.
…ion names

- Prefix with _ when PascalCase result starts with a digit (e.g. '123abc' → '_123abcDocument')
- Fall back to 'CollectionDocument' when name is empty or only separators
- Filter empty segments from split result
- Add tests for edge cases

Addresses PR #506 review comment from copilot.
Add comment explaining why the cast to JSONSchema is safe: our
SchemaAnalyzer never produces boolean schema refs. Notes that a
typeof guard should be added if the function is ever reused with
externally-sourced schemas.

Addresses PR #506 review comment from copilot.
…lashes

- Replace SPECIAL_CHARS_PATTERN with JS_IDENTIFIER_PATTERN for proper
  identifier validity check (catches dashes, brackets, digits, quotes, etc.)
- Escape embedded double quotes and backslashes when quoting insertText
- Add tests for all edge cases (dashes, brackets, digits, quotes, backslashes)
- Mark future-work item #1 as resolved; item #2 (referenceText/$getField)
  remains open for aggregation completion provider phase

Addresses PR #506 review comment from copilot.
tnaum-ms added 30 commits March 16, 2026 17:06
Implements detectCursorContext() - a pure function that determines the
semantic position of the cursor within a DocumentDB query expression.

Supports: key, value, operator, array-element, and unknown positions.
Uses backward character scanning (no AST parsing) for robustness with
incomplete/mid-typing input.

Includes 41 tests covering:
- Core context detection (complete expressions)
- Incomplete/broken input (Step 1.5 from plan)
- Multi-line expressions
- fieldLookup integration for BSON type enrichment
Wire CursorContext into createCompletionItems() to filter completions
based on the semantic cursor position:

- key position: field names + key-position operators (, , etc.)
- value position: BSON constructors only (ObjectId, UUID, ISODate)
- operator position: comparison/element/array operators with type-aware sort
- array-element: same as key position
- unknown/undefined: full list (backward compatible)

All 48 existing tests pass unchanged. 16 new context-sensitive tests added.
Call detectCursorContext() in provideCompletionItems to detect the
semantic cursor position from editor text + offset. Pass the detected
CursorContext through to createCompletionItems() for context-sensitive
filtering.

Includes field type lookup via completionStore to enrich context with
BSON types for type-aware operator sorting.
Issue A - Strip outer braces from operator snippets at operator position:
  { $gt: ${1:value} } → $gt: ${1:value}  (when inside { field: { | } })
  Prevents double-brace insertion bug.

Issue B - Add category labels to completion items:
  Uses CompletionItemLabel.detail for operator category (comparison, logical, etc.)
  Uses CompletionItemLabel.description for operator description text.

Issue C - Value position now shows operators + BSON constructors:
  Operators sorted first (0_), BSON constructors second (1_).
  Operators keep their full brace-wrapping snippets at value position.

Also: trace-level console.debug logging in provideCompletionItems for debugging.

178 tests pass across 6 test suites. Build + lint clean.
- Remove 'detail' field from operator completions (was noisy)
- Move category label to label.description (matches field items pattern)
- Move entry.description to documentation (was in both detail and desc)
- Combine description text + docs link in documentation field
- Add range/insertText to trace logging for snippet debugging
Root cause: Monaco snippet syntax interprets $gt as a variable
reference (variable 'gt'), which resolves to empty string for unknown
variables. Result: { $gt: ${1:value} } produces '{ : value }'.

Fix: escapeSnippetDollars() escapes $ before letters ($gt → \$gt)
while preserving tab stop syntax (${1:value} unchanged).
BSON constructors like ObjectId("${1:hex}") are unaffected.

This was the root cause of POC Observation 2 (T5 ⚠️ PARTIAL).
…estions

Refactor:
- Extract completion logic into completions/ folder:
  - snippetUtils.ts: stripOuterBraces, escapeSnippetDollars
  - mapCompletionItems.ts: operator/field → CompletionItem mapping
  - typeSuggestions.ts: type-aware value suggestions (NEW)
  - createCompletionItems.ts: main entry point with context branching
  - index.ts: barrel exports
- documentdbQueryCompletionProvider.ts now re-exports from completions/

New feature — type-aware value suggestions at value position:
- bool fields → true, false (as first completions)
- int/double/long/decimal → range query { $gt: ▪, $lt: ▪ }
- string → regex snippet, empty string literal
- date → ISODate constructor, date range snippet
- objectid → ObjectId constructor
- null → null literal
- array → $elemMatch, $size snippets

Suggestions get sort prefix 00_ (before operators at 0_).
First suggestion is preselected.

197 tests pass across 6 suites. Build + lint clean.
The schema analyzer uses BSONTypes enum strings ('boolean', 'int32',
'decimal128') while the type suggestions used shortened names ('bool',
'int', 'decimal'). Fix the TYPE_SUGGESTIONS keys to match:
- bool → boolean
- int → int32
- decimal → decimal128
- Added: number (generic BSONTypes.Number)

Also noted the type-name mismatch between schema-analyzer BSONTypes
('int32') and documentdb-constants applicableBsonTypes ('int') —
tracked as a future normalization task.
Replace format pattern 'yyyy-MM-ddTHH:mm:ssZ' with a valid example
'2025-01-01T00:00:00Z'. The format pattern was confusing — users
typed over it with values like '2222-11-11T11:11:111' which fail
ISODate validation.

Also update date range snippet with realistic start/end dates.
Previously `new Daddddte()` was not flagged because the validator only
handled CallExpression nodes. NewExpression has the same callee shape
but was not visited. Now unknown constructors in `new Foo()` produce
error diagnostics (red squiggles), and near-miss typos like `new Dae()`
produce warnings.
Add Date.now(), Math.floor(), Math.ceil(), Math.round(), Math.min(),
Math.max() as individual completion items alongside the class-level
Date/Math entries. Removed the bare `Math` entry since the individual
methods are more useful. All entries use sort prefix 4_ (after BSON
constructors at 3_).
…_COMPLETION_META

PROJECTION_COMPLETION_META was ['field:identifier'] which returned 0 static
entries — projection operators ($slice, $elemMatch, $) and BSON constructors
were never shown in project/sort editors. Added 'query:projection' and 'bson'
to the meta preset.
Never use git add -f to override .gitignore. Files in docs/plan/ and
docs/analysis/ are local planning documents excluded from the repo.
- Remove unnecessary escape in template literal (no-useless-escape)
- Formatter-applied changes to imports and whitespace
Tests were using 'int' but the schema analyzer produces 'int32' (BSONTypes.Int32).
Also fixed applicableBsonTypes in documentdb-constants bitwise operators to use
'int32' to match, which was a real production mismatch.
When cursorContext is undefined or 'unknown', show the full set of completions
(fields, all operators, BSON constructors, JS globals) instead of only key-position
items. This is a better UX fallback — when context can't be determined, the user
is better served by seeing everything available.
These debug logs were used during development and logged full editor text and
completion items to the webview console on every keystroke. Removed to avoid
unnecessary noise and potential data exposure.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Improve Scrapbook Experience (shell integration) 🚀

2 participants