Commit ddf2652

documentation updates
1 parent 7a05c0a commit ddf2652

5 files changed: 128 additions and 53 deletions.

CHANGELOG.md

Lines changed: 21 additions & 1 deletion
@@ -1,6 +1,26 @@
11
# Changelog
22

3-
## Version 0.4 (development)
3+
## Version 0.5.0
4+
5+
- SQLAlchemy session management
6+
* Implemented proper session handling
7+
* Fixed `DetachedInstanceError` issues and added helper method `_get_detached_resource` for consistent session management
8+
* Improved transaction handling with commits and rollbacks
9+
10+
- New features
11+
* Added cache statistics with `get_stats()` method
12+
* Implemented resource tagging
13+
* Added cache size management
14+
* Added support for file compression
15+
* Added resource validation with checksums
16+
* Improved search
17+
* Added metadata export/import functionality
18+
19+
## Version 0.4.1
20+
21+
- Method to list all resources.
22+
23+
## Version 0.4
424

525
- Migrate the schema to match R/Bioconductor's BiocFileCache (see [this issue](https://github.com/BiocPy/pyBiocFileCache/issues/11)). Thanks to [@khoroshevskyi](https://github.com/khoroshevskyi) for the PR.
626
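The changelog entry "resource validation with checksums" does not show how validation is done. The following is a minimal sketch of the general technique using only the standard library, not the package's actual implementation; `file_checksum` and `is_valid` are hypothetical helper names, and SHA-256 is an assumed choice of algorithm.

```python
import hashlib
import tempfile
from pathlib import Path


def file_checksum(path: Path, algorithm: str = "sha256", chunk_size: int = 65536) -> str:
    """Return the hex digest of a file, reading it in fixed-size chunks."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        # iter() with a sentinel stops when read() returns empty bytes (end of file).
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_valid(path: Path, expected: str) -> bool:
    """Hypothetical check: the file exists and its digest matches the recorded one."""
    return path.is_file() and file_checksum(path) == expected


with tempfile.TemporaryDirectory() as tmp:
    cached = Path(tmp) / "file.txt"
    cached.write_bytes(b"example payload")

    # Record the checksum when the file enters the cache.
    recorded = file_checksum(cached)
    unchanged_ok = is_valid(cached, recorded)

    # Simulate corruption of the cached copy.
    cached.write_bytes(b"tampered payload")
    tampered_ok = is_valid(cached, recorded)
```

Reading in chunks keeps memory use flat for large files, which matters for a cache that may hold multi-gigabyte resources.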

README.md

Lines changed: 62 additions & 45 deletions
@@ -4,74 +4,91 @@
44

55
# pyBiocFileCache
66

7-
File system based cache for resources & metadata. Compatible with [BiocFileCache R package](https://github.com/Bioconductor/BiocFileCache)
7+
`pyBiocFileCache` is a Python package that provides a file caching system with resource validation, cache size management, file compression, and resource tagging. It is compatible with the [BiocFileCache R package](https://github.com/Bioconductor/BiocFileCache).
88

9-
***Note: Package is in development. Use with caution!!***
9+
## Installation
1010

11-
### Installation
11+
Install from [PyPI](https://pypi.org/project/pyBiocFileCache/):
1212

13-
Package is published to [PyPI](https://pypi.org/project/pyBiocFileCache/)
14-
15-
```
13+
```bash
1614
pip install pybiocfilecache
1715
```
1816

19-
#### Initialize a cache directory
20-
21-
```
22-
from pybiocfilecache import BiocFileCache
23-
import os
17+
## Quick Start
2418

25-
bfc = BiocFileCache(cache_dir = os.getcwd() + "/cache")
26-
```
19+
```python
20+
from pybiocfilecache import BiocFileCache
2721

28-
Once the cache directory is created, the library provides methods to
29-
- `add`: Add a resource or artifact to cache
30-
- `get`: Get the resource from cache
31-
- `remove`: Remove a resource from cache
32-
- `update`: update the resource in cache
33-
- `purge`: purge the entire cache, removes all files in the cache directory
22+
# Initialize cache
23+
cache = BiocFileCache("path/to/cache/directory")
3424

35-
### Add a resource to cache
25+
# Add a file to cache
26+
resource = cache.add("myfile", "path/to/file.txt")
3627

37-
(for testing use the temp files in the `tests/data` directory)
28+
# Retrieve a file from cache
29+
resource = cache.get("myfile")
3830

39-
```
40-
rec = bfc.add("test1", os.getcwd() + "/test1.txt")
41-
print(rec)
31+
# Use the cached file
32+
print(resource.rpath) # Path to cached file
4233
```
4334

44-
### Get resource from cache
35+
## Advanced Usage
4536

46-
```
47-
rec = bfc.get("test1")
48-
print(rec)
49-
```
37+
### Configuration
5038

51-
### Remove resource from cache
39+
```python
40+
from pybiocfilecache import BiocFileCache, CacheConfig
41+
from datetime import timedelta
42+
from pathlib import Path
5243

53-
```
54-
rec = bfc.remove("test1")
55-
print(rec)
44+
# Create custom configuration
45+
config = CacheConfig(
46+
cache_dir=Path("cache_directory"),
47+
max_size_bytes=1024 * 1024 * 1024, # 1GB
48+
cleanup_interval=timedelta(days=7),
49+
compression=True
50+
)
51+
52+
# Initialize cache with configuration
53+
cache = BiocFileCache(config=config)
5654
```
5755

58-
### Update resource in cache
56+
### Resource Management
5957

60-
```
61-
rec = bfc.get("test1"m os.getcwd() + "test2.txt")
62-
print(rec)
63-
```
58+
```python
59+
# Add file with tags and expiration
60+
from datetime import datetime, timedelta
6461

65-
### purge the cache
62+
resource = cache.add(
63+
"myfile",
64+
"path/to/file.txt",
65+
tags=["data", "raw"],
66+
expires=datetime.now() + timedelta(days=30)
67+
)
6668

67-
```
68-
bfc.purge()
69+
# List resources by tag
70+
resources = cache.list_resources(tag="data")
71+
72+
# Search resources
73+
results = cache.search("myfile", field="rname")
74+
75+
# Update resource
76+
cache.update("myfile", "path/to/new_file.txt")
77+
78+
# Remove resource
79+
cache.remove("myfile")
6980
```
7081

82+
### Cache Statistics and Maintenance
7183

72-
<!-- pyscaffold-notes -->
84+
```python
85+
# Get cache statistics
86+
stats = cache.get_stats()
87+
print(stats)
7388

74-
## Note
89+
# Clean up expired resources
90+
removed_count = cache.cleanup()
7591

76-
This project has been set up using PyScaffold 4.1. For details and usage
77-
information on PyScaffold see https://pyscaffold.org/.
92+
# Purge entire cache
93+
cache.purge()
94+
```
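The README shows `cache.cleanup()` returning a count of removed resources. As a rough illustration of that contract, and not the package's code, here is a sketch over an in-memory dictionary; the `Entry` class and the free function `cleanup` are hypothetical stand-ins, and treating `expires=None` as "never expires" is an assumption.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Dict, Optional


@dataclass
class Entry:
    """Hypothetical stand-in for a cached resource record."""

    rname: str
    expires: Optional[datetime] = None  # Assumption: None means "never expires".


def cleanup(entries: Dict[str, Entry], now: datetime) -> int:
    """Delete expired entries in place and return how many were removed."""
    expired = [
        name
        for name, entry in entries.items()
        if entry.expires is not None and entry.expires <= now
    ]
    # Collect names first so the dict is not mutated while iterating over it.
    for name in expired:
        del entries[name]
    return len(expired)


now = datetime(2024, 1, 1, 12, 0, 0)
entries = {
    "old": Entry("old", expires=now - timedelta(days=1)),
    "fresh": Entry("fresh", expires=now + timedelta(days=30)),
    "permanent": Entry("permanent"),
}
removed_count = cleanup(entries, now)
```

Passing `now` explicitly, rather than calling `datetime.now()` inside the function, keeps the behavior deterministic and easy to test.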

docs/best_practices.md

Lines changed: 28 additions & 0 deletions
@@ -0,0 +1,28 @@
1+
# Best Practices
2+
3+
1. Use context managers for cleanup:
4+
```python
5+
with BiocFileCache("cache_directory") as cache:
6+
    cache.add("myfile", "path/to/file.txt")
7+
```
8+
9+
2. Add tags for better organization:
10+
```python
11+
cache.add("data.csv", "data.csv", tags=["raw", "csv", "2024"])
12+
```
13+
14+
3. Set expiration dates for temporary files:
15+
```python
16+
cache.add("temp.txt", "temp.txt", expires=datetime.now() + timedelta(hours=1))
17+
```
18+
19+
4. Regular maintenance:
20+
```python
21+
# Periodically clean up expired resources
22+
cache.cleanup()
23+
24+
# Monitor cache size
25+
stats = cache.get_stats()
26+
if stats["cache_size_bytes"] > threshold:
27+
    pass  # Take action
28+
```
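The size check in item 4 leaves `threshold` and the follow-up action undefined. One way to think about it, sketched here without the library, is to measure the cache directory directly and compare against a byte limit. `directory_size_bytes` and `over_threshold` are hypothetical helpers, and the 1000-byte threshold is only for illustration.

```python
import tempfile
from pathlib import Path


def directory_size_bytes(root: Path) -> int:
    """Sum the sizes of all regular files under root, including subdirectories."""
    return sum(p.stat().st_size for p in root.rglob("*") if p.is_file())


def over_threshold(root: Path, threshold: int) -> bool:
    """Return True when the directory holds more bytes than the threshold."""
    return directory_size_bytes(root) > threshold


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "a.bin").write_bytes(b"x" * 600)
    (root / "nested").mkdir()
    (root / "nested" / "b.bin").write_bytes(b"y" * 500)

    total = directory_size_bytes(root)
    needs_cleanup = over_threshold(root, threshold=1000)
```

In practice the "take action" branch would likely call a cleanup or eviction routine; which one is appropriate depends on the cache configuration.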

setup.cfg

Lines changed: 1 addition & 1 deletion
@@ -47,7 +47,7 @@ package_dir =
4747
# For more information, check out https://semver.org/.
4848
install_requires =
4949
importlib-metadata; python_version<"3.8"
50-
sqlalchemy>=2,<2.1
50+
sqlalchemy
5151

5252
[options.packages.find]
5353
where = src

src/pybiocfilecache/cache.py

Lines changed: 16 additions & 6 deletions
@@ -53,8 +53,8 @@ def __init__(self, cache_dir: Optional[Union[str, Path]] = None, config: Optiona
5353
Args:
5454
cache_dir:
5555
Path to cache directory.
56-
5756
Defaults to tmp location, :py:func:`~.utils.create_tmp_dir`.
57+
Ignored if config already contains the path to the cache directory.
5858
5959
config:
6060
Optional configuration.
@@ -281,7 +281,7 @@ def update(
281281
Defaults to ``copy``.
282282
283283
tags:
284-
Additional tags to update the resource.
284+
Optional new list of tags.
285285
286286
Returns:
287287
Updated `Resource` object.
@@ -318,8 +318,11 @@ def update(
318318
def remove(self, rname: str) -> None:
319319
"""Remove a resource from cache by name.
320320
321+
Removes both the cached file and its database entry.
322+
321323
Args:
322-
rname: Name of the resource to remove
324+
rname:
325+
Name of the resource to remove
323326
324327
Raises:
325328
BiocCacheError: If resource removal fails
@@ -355,7 +358,10 @@ def list_resources(
355358
Filter by resource type.
356359
357360
expired:
358-
Filter by expiration status.
361+
Filter by expiration status:
362+
True: only expired resources
363+
False: only non-expired resources
364+
None: all resources
359365
360366
Returns:
361367
List of matching Resource objects.
@@ -381,6 +387,9 @@ def cleanup(self) -> int:
381387
382388
Returns:
383389
Number of resources removed.
390+
391+
Note:
392+
Updates `_last_cleanup` timestamp after completion.
384393
"""
385394
removed = 0
386395
with self.get_session() as session:
@@ -403,7 +412,8 @@ def validate_resource(self, resource: Resource) -> bool:
403412
"""Validate resource integrity.
404413
405414
Args:
406-
resource: Resource to validate.
415+
resource:
416+
Resource to validate.
407417
408418
Returns:
409419
True if resource is valid, False otherwise.
@@ -482,7 +492,7 @@ def search(self, query: str, field: str = "rname", exact: bool = False) -> List[
482492
Search string.
483493
484494
field:
485-
Field to search.
495+
Resource field to search ("rname", "rtype", "tags", etc.).
486496
487497
exact:
488498
Whether to require exact match.
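The docstring changes above describe two behaviors: the tri-state `expired` filter in `list_resources` and the `field`/`exact` options in `search`. The sketch below models both over plain Python objects to make the semantics concrete. It is not the library's implementation (which runs against a database); `Record`, `filter_expired`, and the free function `search` are hypothetical, and substring matching for the non-exact case is an assumption.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List, Optional


@dataclass
class Record:
    """Hypothetical stand-in for a Resource row."""

    rname: str
    rtype: str = "local"
    tags: str = ""
    expires: Optional[datetime] = None


def filter_expired(records: List[Record], expired: Optional[bool], now: datetime) -> List[Record]:
    """True: only expired; False: only non-expired; None: all records."""
    if expired is None:
        return list(records)

    def is_expired(record: Record) -> bool:
        return record.expires is not None and record.expires < now

    return [r for r in records if is_expired(r) == expired]


def search(records: List[Record], query: str, field: str = "rname", exact: bool = False) -> List[Record]:
    """Match query against one field, exactly or (assumed) as a substring."""
    results = []
    for record in records:
        value = str(getattr(record, field))
        matched = (value == query) if exact else (query in value)
        if matched:
            results.append(record)
    return results


now = datetime(2024, 6, 1)
records = [
    Record("counts.csv", tags="raw,csv", expires=datetime(2024, 1, 1)),
    Record("counts_v2.csv", tags="csv"),
    Record("notes.txt", expires=datetime(2025, 1, 1)),
]

only_expired = [r.rname for r in filter_expired(records, True, now)]
not_expired = [r.rname for r in filter_expired(records, False, now)]
everything = filter_expired(records, None, now)

partial = [r.rname for r in search(records, "counts")]
exact_hit = [r.rname for r in search(records, "counts.csv", exact=True)]
by_tag = [r.rname for r in search(records, "raw", field="tags")]
```

Using `None` as a third state lets a single optional parameter express "no filter" without a separate flag.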
