PyMongoSQL is a Python DB API 2.0 (PEP 249) client for MongoDB. It provides a familiar SQL interface to MongoDB, allowing developers to use SQL to interact with MongoDB collections.
PyMongoSQL implements the DB API 2.0 interfaces to provide SQL-like access to MongoDB, built on PartiQL syntax for querying semi-structured data. The project aims to:
- Bridge SQL and NoSQL: Provide SQL capabilities for MongoDB's nested document structures
- Standard SQL Operations: Support DQL (SELECT) and DML (INSERT, UPDATE, DELETE) operations with WHERE, ORDER BY, and LIMIT clauses
- Seamless Integration: Full compatibility with Python applications expecting DB API 2.0 compliance
- Easy Migration: Enable migration from traditional SQL databases to MongoDB without rewriting application code
- DB API 2.0 Compliant: Full compatibility with Python Database API 2.0 specification
- PartiQL-based SQL Syntax: Built on PartiQL (SQL for semi-structured data), enabling seamless SQL querying of nested and hierarchical MongoDB documents
- Nested Structure Support: Query and filter deeply nested fields and arrays within MongoDB documents using standard SQL syntax
- SQLAlchemy Integration: Complete ORM and Core support with dedicated MongoDB dialect
- SQL Query Support: SELECT statements with WHERE conditions, field selection, and aliases
- DML Support: Full support for INSERT, UPDATE, and DELETE operations using PartiQL syntax
- Connection String Support: MongoDB URI format for easy configuration
- Python: 3.9, 3.10, 3.11, 3.12, 3.13+
- MongoDB: 7.0+
- PyMongo (MongoDB Python Driver): pymongo >= 4.15.0
- ANTLR4 (SQL Parser Runtime): antlr4-python3-runtime >= 4.13.0
- JMESPath (JSON/Dict Path Query): jmespath >= 1.0.0
- SQLAlchemy (for ORM/Core support): sqlalchemy >= 1.4.0 (SQLAlchemy 1.4+ and 2.0+ supported)
pip install pymongosql

Or install from source:

git clone https://github.com/your-username/PyMongoSQL.git
cd PyMongoSQL
pip install -e .

Table of Contents:
- Basic Usage
- Using Connection String
- Context Manager Support
- Using DictCursor for Dictionary Results
- Cursor vs DictCursor
- Query with Parameters
- Supported SQL Features
- Apache Superset Integration
- Limitations & Roadmap
- Contributing
- License
from pymongosql import connect

# Connect to MongoDB
connection = connect(
    host="mongodb://localhost:27017",
    database="database"
)
cursor = connection.cursor()
cursor.execute('SELECT name, email FROM users WHERE age > 25')
print(cursor.fetchall())

from pymongosql import connect

# Connect with authentication
connection = connect(
    host="mongodb://username:password@localhost:27017/database?authSource=admin"
)
cursor = connection.cursor()
cursor.execute('SELECT * FROM products WHERE category = ?', ['Electronics'])
for row in cursor:
    print(row)

from pymongosql import connect

with connect(host="mongodb://localhost:27017/database") as conn:
    with conn.cursor() as cursor:
        cursor.execute('SELECT COUNT(*) as total FROM users')
        result = cursor.fetchone()
        print(f"Total users: {result[0]}")

from pymongosql import connect
from pymongosql.cursor import DictCursor

with connect(host="mongodb://localhost:27017/database") as conn:
    with conn.cursor(DictCursor) as cursor:
        cursor.execute('SELECT COUNT(*) as total FROM users')
        result = cursor.fetchone()
        print(f"Total users: {result['total']}")

PyMongoSQL provides two cursor types for different result formats:
Cursor (default) - Returns results as tuples:

cursor = connection.cursor()
cursor.execute('SELECT name, email FROM users')
row = cursor.fetchone()
print(row[0])  # Access by index

DictCursor - Returns results as dict:

from pymongosql.cursor import DictCursor

cursor = connection.cursor(DictCursor)
cursor.execute('SELECT name, email FROM users')
row = cursor.fetchone()
print(row['name'])  # Access by column name

PyMongoSQL supports two styles of parameterized queries for safe value substitution:
Positional Parameters with ?

from pymongosql import connect

connection = connect(host="mongodb://localhost:27017/database")
cursor = connection.cursor()
cursor.execute(
    'SELECT name, email FROM users WHERE age > ? AND status = ?',
    [25, 'active']
)

Named Parameters with :name

from pymongosql import connect

connection = connect(host="mongodb://localhost:27017/database")
cursor = connection.cursor()
cursor.execute(
    'SELECT name, email FROM users WHERE age > :age AND status = :status',
    {'age': 25, 'status': 'active'}
)

Parameters are substituted into the MongoDB filter during execution, providing protection against injection attacks.
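To make the injection-safety point concrete, here is a minimal sketch of how positional values can be bound into a MongoDB-style filter as data instead of being spliced into the query text. The helper `bind_filter` and its `(field, operator)` input are hypothetical and only illustrate the idea; PyMongoSQL performs the real translation through its SQL parser, and its internals may differ.

```python
# Hypothetical illustration only; not PyMongoSQL's actual implementation.
# Maps SQL comparison operators to MongoDB query operators ("=" needs none).
OPERATORS = {"=": None, ">": "$gt", ">=": "$gte", "<": "$lt", "<=": "$lte"}


def bind_filter(conditions, params):
    """Pair (field, operator) conditions with positional values.

    Values are stored in the filter as plain data, so a string such as
    "x' OR '1'='1" stays an ordinary string and is never parsed as SQL.
    This sketch supports a single condition per field.
    """
    if len(conditions) != len(params):
        raise ValueError("number of parameters does not match placeholders")
    mongo_filter = {}
    for (field, op), value in zip(conditions, params):
        mongo_op = OPERATORS[op]
        mongo_filter[field] = value if mongo_op is None else {mongo_op: value}
    return mongo_filter


# WHERE age > ? AND status = ?   with parameters [25, 'active']
example_filter = bind_filter([("age", ">"), ("status", "=")], [25, "active"])
print(example_filter)
```

Because the value never passes through the SQL parser, a malicious string supplied as a parameter can only ever be compared against a field; it cannot change the shape of the query.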
- Field selection: SELECT name, age FROM users
- Wildcards: SELECT * FROM products
- Field aliases: SELECT name AS user_name, age AS user_age FROM users
- Nested fields: SELECT profile.name, profile.age FROM users
- Array access: SELECT items[0], items[1].name FROM orders
- Equality: WHERE name = 'John'
- Comparisons: WHERE age > 25, WHERE price <= 100.0
- Logical operators: WHERE age > 18 AND status = 'active', WHERE age < 30 OR role = 'admin'
- Nested field filtering: WHERE profile.status = 'active'
- Array filtering: WHERE items[0].price > 100
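For readers more familiar with MongoDB's native query language, the WHERE clauses listed above correspond roughly to the filter documents below. These are hand-written equivalents using standard MongoDB operators (`$gt`, `$lte`, `$lt`, `$and`, `$or`) and dot notation; they are an assumption about the general shape of the translation, not a dump of what PyMongoSQL actually generates.

```python
# Hand-written, approximate equivalents; PyMongoSQL's generated filters may differ.
WHERE_TO_FILTER = {
    "name = 'John'": {"name": "John"},
    "age > 25": {"age": {"$gt": 25}},
    "price <= 100.0": {"price": {"$lte": 100.0}},
    "age > 18 AND status = 'active'": {
        "$and": [{"age": {"$gt": 18}}, {"status": "active"}]
    },
    "age < 30 OR role = 'admin'": {
        "$or": [{"age": {"$lt": 30}}, {"role": "admin"}]
    },
    # Nested fields use dot notation in MongoDB filters.
    "profile.status = 'active'": {"profile.status": "active"},
    # Array positions are addressed by index within the dotted path.
    "items[0].price > 100": {"items.0.price": {"$gt": 100}},
}

for clause, mongo_filter in WHERE_TO_FILTER.items():
    print(f"WHERE {clause} -> {mongo_filter}")
```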
- Single-level: profile.name, settings.theme
- Multi-level: account.profile.name, config.database.host
- Array access: items[0].name, orders[1].total
- Complex queries: WHERE customer.profile.age > 18 AND orders[0].status = 'paid'
Note: Avoid SQL reserved words (user, data, value, count, etc.) as unquoted field names. Use alternatives or bracket notation for arrays.
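The nested path notation described above can be understood as a walk through dictionaries and lists. The following is a minimal sketch of that walk in plain Python; `resolve_path` is a hypothetical helper written for illustration and is not part of PyMongoSQL's API.

```python
import re

# A path segment is either a field name or a bracketed array index.
_TOKEN = re.compile(r"([A-Za-z_][A-Za-z0-9_]*)|\[(\d+)\]")


def resolve_path(document, path):
    """Return the value at a path such as 'orders[1].total'.

    Returns None if any step along the path is missing.
    """
    current = document
    for name, index in _TOKEN.findall(path):
        try:
            # Exactly one of `name` / `index` is non-empty for each token.
            current = current[name] if name else current[int(index)]
        except (KeyError, IndexError, TypeError):
            return None
    return current


doc = {
    "account": {"profile": {"name": "Ann"}},
    "items": [{"name": "pen"}, {"name": "ink"}],
}
print(resolve_path(doc, "account.profile.name"))
print(resolve_path(doc, "items[1].name"))
```

Returning None for a missing step mirrors how a document database treats absent fields, where a lookup on a field that does not exist yields no value instead of an error.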
- ORDER BY: ORDER BY name ASC, age DESC
- LIMIT: LIMIT 10
- Combined: ORDER BY created_at DESC LIMIT 5
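The semantics of a multi-column ORDER BY followed by LIMIT can be sketched on in-memory rows as shown below. The function `order_and_limit` is a hypothetical stand-in used only to show the expected ordering; in practice the sorting and limiting are expected to happen in MongoDB, not in Python.

```python
def order_and_limit(rows, order_by, limit=None):
    """Sort dict rows by (field, direction) pairs, then keep the first `limit` rows."""
    result = list(rows)
    # Apply the least significant key first. Python's sort is stable,
    # so keys listed earlier in `order_by` end up taking precedence.
    for field, direction in reversed(order_by):
        result.sort(key=lambda row: row[field], reverse=(direction.upper() == "DESC"))
    return result if limit is None else result[:limit]


rows = [
    {"name": "Bob", "age": 30},
    {"name": "Al", "age": 25},
    {"name": "Bob", "age": 41},
]

# ORDER BY name ASC, age DESC LIMIT 2
top_two = order_and_limit(rows, [("name", "ASC"), ("age", "DESC")], limit=2)
print(top_two)
```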
PyMongoSQL supports inserting documents into MongoDB collections using both PartiQL-style object literals and standard SQL INSERT VALUES syntax.
Single Document

cursor.execute(
    "INSERT INTO Music {'title': 'Song A', 'artist': 'Alice', 'year': 2021}"
)

Multiple Documents (Bag Syntax)

cursor.execute(
    "INSERT INTO Music << {'title': 'Song B', 'artist': 'Bob'}, {'title': 'Song C', 'artist': 'Charlie'} >>"
)

Parameterized INSERT

# Positional parameters using ? placeholders
cursor.execute(
    "INSERT INTO Music {'title': '?', 'artist': '?', 'year': '?'}",
    ["Song D", "Diana", 2020]
)

Single Row with Column List

cursor.execute(
    "INSERT INTO Music (title, artist, year) VALUES ('Song E', 'Eve', 2022)"
)

Multiple Rows

cursor.execute(
    "INSERT INTO Music (title, artist, year) VALUES ('Song F', 'Frank', 2023), ('Song G', 'Grace', 2024)"
)

Parameterized INSERT VALUES

# Positional parameters (?)
cursor.execute(
    "INSERT INTO Music (title, artist, year) VALUES (?, ?, ?)",
    ["Song H", "Henry", 2025]
)

# Named parameters (:name)
cursor.execute(
    "INSERT INTO Music (title, artist) VALUES (:title, :artist)",
    {"title": "Song I", "artist": "Iris"}
)

PyMongoSQL supports updating documents in MongoDB collections using standard SQL UPDATE syntax.
Update All Documents

cursor.execute("UPDATE Music SET available = false")

Update with WHERE Clause

cursor.execute("UPDATE Music SET price = 14.99 WHERE year < 2020")

Update Multiple Fields

cursor.execute(
    "UPDATE Music SET price = 19.99, available = true WHERE artist = 'Alice'"
)

Update with Logical Operators

cursor.execute(
    "UPDATE Music SET price = 9.99 WHERE year = 2020 AND stock > 5"
)

Parameterized UPDATE

# Positional parameters using ? placeholders
cursor.execute(
    "UPDATE Music SET price = ?, stock = ? WHERE artist = ?",
    [24.99, 50, "Bob"]
)

Update Nested Fields

cursor.execute(
    "UPDATE Music SET details.publisher = 'XYZ Records' WHERE title = 'Song A'"
)

Check Updated Row Count

cursor.execute("UPDATE Music SET available = false WHERE year = 2020")
print(f"Updated {cursor.rowcount} documents")

PyMongoSQL supports deleting documents from MongoDB collections using standard SQL DELETE syntax.
Delete All Documents

cursor.execute("DELETE FROM Music")

Delete with WHERE Clause

cursor.execute("DELETE FROM Music WHERE year < 2020")

Delete with Logical Operators

cursor.execute(
    "DELETE FROM Music WHERE year = 2019 AND available = false"
)

Parameterized DELETE

# Positional parameters using ? placeholders
cursor.execute(
    "DELETE FROM Music WHERE artist = ? AND year < ?",
    ["Charlie", 2021]
)

Check Deleted Row Count

cursor.execute("DELETE FROM Music WHERE available = false")
print(f"Deleted {cursor.rowcount} documents")

PyMongoSQL supports DB API 2.0 transactions for ACID-compliant database operations. Use the begin(), commit(), and rollback() methods to manage transactions:
from pymongosql import connect

connection = connect(host="mongodb://localhost:27017/database")
try:
    connection.begin()  # Start transaction
    cursor = connection.cursor()
    cursor.execute('UPDATE accounts SET balance = 100 WHERE id = ?', [1])
    cursor.execute('UPDATE accounts SET balance = 200 WHERE id = ?', [2])
    connection.commit()  # Commit all changes
    print("Transaction committed successfully")
except Exception as e:
    connection.rollback()  # Rollback on error
    print(f"Transaction failed: {e}")
finally:
    connection.close()

Note: MongoDB requires a replica set or sharded cluster for transaction support. Standalone MongoDB servers do not support ACID transactions at the server level.
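If the begin/commit/rollback pattern is repeated in many places, it can be wrapped in a small context manager. The sketch below relies only on the begin(), commit(), and rollback() methods named in this section; the `transaction` helper itself is hypothetical and is not shipped with PyMongoSQL.

```python
from contextlib import contextmanager


@contextmanager
def transaction(connection):
    """Commit if the block succeeds; roll back and re-raise if it fails."""
    connection.begin()
    try:
        yield connection
    except Exception:
        connection.rollback()
        raise
    else:
        connection.commit()


# Intended usage, assuming `connection` came from pymongosql.connect(...):
#
# with transaction(connection) as conn:
#     cursor = conn.cursor()
#     cursor.execute('UPDATE accounts SET balance = 100 WHERE id = ?', [1])
#     cursor.execute('UPDATE accounts SET balance = 200 WHERE id = ?', [2])
```

The helper deliberately does not close the connection, so the same connection can be reused for several transactions and closed once by the caller.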
PyMongoSQL can be used as a database driver in Apache Superset for querying and visualizing MongoDB data:
- Install PyMongoSQL: Install PyMongoSQL on the Superset app server:
  pip install pymongosql
- Create Connection: Connect to your MongoDB instance using the connection URI with superset mode:
  mongodb://username:password@host:port/database?mode=superset
  or for MongoDB Atlas:
  mongodb+srv://username:password@host/database?mode=superset
- Use SQL Lab: Write and execute SQL queries against MongoDB collections directly in Superset's SQL Lab
- Create Visualizations: Build charts and dashboards from your MongoDB queries using Superset's visualization tools
This allows seamless integration between MongoDB data and Superset's BI capabilities without requiring data migration to traditional SQL databases.
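Credentials that contain characters such as `@` or `/` must be percent-encoded before they are placed in a MongoDB URI. The sketch below assembles the Superset connection URIs shown above using only the standard library; `superset_uri` is a hypothetical convenience function, and the host and database names in the example are placeholders.

```python
from urllib.parse import quote_plus, urlencode


def superset_uri(username, password, host, database, port=None, srv=False):
    """Build a MongoDB URI with mode=superset and percent-encoded credentials."""
    scheme = "mongodb+srv" if srv else "mongodb"
    # mongodb+srv URIs do not carry a port; it is resolved through DNS.
    netloc = host if (srv or port is None) else f"{host}:{port}"
    credentials = f"{quote_plus(username)}:{quote_plus(password)}"
    query = urlencode({"mode": "superset"})
    return f"{scheme}://{credentials}@{netloc}/{database}?{query}"


local_uri = superset_uri("app", "p@ss/word", "localhost", "shop", port=27017)
atlas_uri = superset_uri("app", "secret", "cluster0.example.net", "shop", srv=True)
print(local_uri)
print(atlas_uri)
```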
Note: PyMongoSQL currently supports DQL (Data Query Language) and DML (Data Manipulation Language) operations. The following SQL features are not yet supported but are planned for future releases:
- Advanced DML Operations: REPLACE, MERGE, UPSERT
These features are on our development roadmap and contributions are welcome!
Contributions are welcome! Please feel free to submit a Pull Request. For major changes, please open an issue first to discuss what you would like to change.
PyMongoSQL is distributed under the MIT license.