@@ -173,6 +173,93 @@ The cache timestamp functionality is fully backward compatible:
* No changes to Repository or ProjectDirectory APIs
* All existing code continues to work unchanged

+ Best Practices
+ --------------
+
+ Shared Cache Usage
+ ~~~~~~~~~~~~~~~~~~
+
+ .. warning::
+    **Recommendation: Use Separate Cache Instances**
+
+    While it is technically possible to share the same cache object across multiple Repository instances,
+    we **strongly recommend using separate cache instances**, one for each repository.
+
+ **Recommended Approach - Separate Caches:**
+
+ .. code-block:: python
+
+     from gitpandas import Repository
+     from gitpandas.cache import DiskCache
+
+     # Create separate cache instances for each repository
+     cache1 = DiskCache(filepath='repo1_cache.gz')
+     cache2 = DiskCache(filepath='repo2_cache.gz')
+
+     repo1 = Repository('/path/to/repo1', cache_backend=cache1)
+     repo2 = Repository('/path/to/repo2', cache_backend=cache2)
+
+ **Benefits of Separate Caches:**
+
+ * **Complete Isolation**: No risk of cache eviction conflicts between repositories
+ * **Predictable Memory Usage**: Each repository has its own memory budget
+ * **Easier Debugging**: Cache issues are isolated to specific repositories
+ * **Better Performance**: No lock contention in multi-threaded scenarios
+ * **Clear Cache Management**: You can clear or manage each repository's cache independently
+
+ **If You Must Share Caches:**
+
+ If you need to share a cache object across multiple repositories (e.g., because of memory constraints),
+ the system is designed to handle this safely:
+
+ .. code-block:: python
+
+     from gitpandas import Repository
+     from gitpandas.cache import EphemeralCache
+
+     # Shared cache (not recommended but supported)
+     shared_cache = EphemeralCache(max_keys=1000)
+
+     repo1 = Repository('/path/to/repo1', cache_backend=shared_cache)
+     repo2 = Repository('/path/to/repo2', cache_backend=shared_cache)
+
+     # Each repository gets separate cache entries
+     files1 = repo1.list_files()  # Creates cache key: list_files||repo1||None
+     files2 = repo2.list_files()  # Creates cache key: list_files||repo2||None
+
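The key format shown in the comments above can be illustrated with a small stand-alone sketch. ``make_cache_key`` is a hypothetical helper written for this example and is not part of the gitpandas API; it only assumes that keys join the method name, the repository name, and the arguments with ``||``, as the comments suggest.

```python
def make_cache_key(method, repo_name, *args):
    """Join method name, repository name, and arguments with '||' (assumed format)."""
    parts = [method, repo_name] + [str(a) for a in args]
    return "||".join(parts)


# A plain dict stands in for the shared cache backend.
shared = {}
shared[make_cache_key("list_files", "repo1", None)] = ["a.py"]
shared[make_cache_key("list_files", "repo2", None)] = ["b.py"]

# Same method and arguments, different repositories: two distinct entries.
print(sorted(shared))
```

Because the repository name is part of every key, two repositories calling the same method with the same arguments never overwrite each other's entries.
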
+ **Shared Cache Considerations:**
+
+ * Repository names are included in cache keys to prevent collisions
+ * Cache eviction affects all repositories sharing the cache
+ * Memory usage is shared across all repositories
+ * Very active repositories may evict cache entries from less active ones
+
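The eviction point in the list above can be made concrete with a toy model. ``ToySharedCache`` below is a hypothetical stand-in, not gitpandas' ``EphemeralCache``; it assumes an oldest-entry-first eviction policy and the ``||``-separated key layout purely for illustration.

```python
from collections import OrderedDict


class ToySharedCache:
    """Bounded cache that drops its oldest entry when full (assumed policy)."""

    def __init__(self, max_keys):
        self.max_keys = max_keys
        self._data = OrderedDict()

    def set(self, key, value):
        if key in self._data:
            self._data.move_to_end(key)
        self._data[key] = value
        # Evict oldest entries until the cache fits its budget again.
        while len(self._data) > self.max_keys:
            self._data.popitem(last=False)

    def keys_for(self, repo_name):
        # The repository name is the second '||'-separated field of the key.
        return [k for k in self._data if k.split("||")[1] == repo_name]


cache = ToySharedCache(max_keys=5)

# The quiet repository caches two results first.
cache.set("list_files||quiet||None", "...")
cache.set("commit_history||quiet||None", "...")

# The busy repository then writes five entries and fills the whole budget.
for i in range(5):
    cache.set(f"blame||busy||rev{i}", "...")

quiet_left = cache.keys_for("quiet")
busy_left = cache.keys_for("busy")
print(len(quiet_left), len(busy_left))
```

In this model the busy repository ends up owning every slot, so the quiet repository has to recompute its results on the next call. With separate caches, each repository keeps its own budget.
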
+ Cache Size Planning
+ ~~~~~~~~~~~~~~~~~~~
+
+ When planning cache sizes, consider:
+
+ * **Repository Size**: Larger repositories generate more cache entries
+ * **Operation Types**: Some operations (like ``cumulative_blame``) create many cache entries
+ * **Memory Constraints**: Balance cache size with available system memory
+ * **Analysis Patterns**: Frequently repeated analyses benefit from larger caches
+
+ **Recommended Cache Sizes:**
+
+ .. code-block:: python
+
+     from gitpandas.cache import DiskCache, EphemeralCache
+
+     # Small repositories (< 1000 commits)
+     cache = EphemeralCache(max_keys=100)
+
+     # Medium repositories (1000-10000 commits)
+     cache = EphemeralCache(max_keys=500)
+
+     # Large repositories (> 10000 commits)
+     cache = EphemeralCache(max_keys=1000)
+
+     # For disk/Redis caches, you can use larger sizes
+     cache = DiskCache(filepath='cache.gz', max_keys=5000)
+
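The size tiers above could be wrapped in a small helper so the choice is made in one place. ``suggest_max_keys`` is a hypothetical function written for this example, not something gitpandas provides. Its thresholds simply mirror the comments in the block above, and treating exactly 10000 commits as "medium" is an assumption.

```python
def suggest_max_keys(commit_count, persistent=False):
    """Return a starting max_keys value based on the tiers listed above."""
    if persistent:
        # Disk or Redis backends: the larger example value from this section.
        return 5000
    if commit_count < 1000:
        return 100
    if commit_count <= 10000:
        return 500
    return 1000


for commits in (250, 4000, 50000):
    print(commits, suggest_max_keys(commits))
```

The result is only a starting point. It can be passed as ``max_keys`` when constructing the cache and then tuned against the memory and analysis-pattern considerations listed earlier.
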
API Reference
-------------