@BalzaniEdoardo
Collaborator

Summary

This PR refactors the GLM-HMM Expectation-Maximization (EM) implementation, improving code organization, performance, and test coverage.

Key Changes

GLM-HMM EM Algorithm Improvements

  • Modularized EM implementation: Refactored the EM algorithm for better maintainability and composability
  • Improved parameter structure: Replaced the single projection_weights array with a tuple-based glm_params (coef, intercept), so coefficients and intercepts are stored separately
  • Enhanced convergence checking: Added proper convergence detection with configurable tolerance and tracking of previous log-likelihoods
  • Better JIT compilation: Improved static argument handling and compilation behavior for faster execution
  • Accurate likelihood computation: Now uses actual data log-likelihood instead of approximate metrics
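
The convergence check described above can be illustrated with a minimal, library-independent sketch. It is not the implementation in this PR: the names has_converged, run_em, and toy_step are hypothetical, and the absolute-difference criterion is an assumption about what "configurable tolerance" means here. The loop keeps the previous log-likelihood and stops once the change falls below tol.

```python
import math
from typing import Callable, Tuple


def has_converged(log_lik: float, prev_log_lik: float, tol: float) -> bool:
    """Return True when the absolute change in log-likelihood is below tol."""
    # With prev_log_lik = -inf (first iteration) the difference is inf, so this is False.
    return abs(log_lik - prev_log_lik) < tol


def run_em(
    em_step: Callable[[float], Tuple[float, float]],
    params: float,
    max_iter: int = 100,
    tol: float = 1e-6,
) -> Tuple[float, float, int]:
    """Iterate a hypothetical em_step until the log-likelihood stabilizes.

    em_step takes the current parameters and returns (new_params, log_likelihood).
    Returns the final parameters, final log-likelihood, and iterations used.
    """
    prev_log_lik = -math.inf
    log_lik = -math.inf
    for n_iter in range(1, max_iter + 1):
        params, log_lik = em_step(params)
        if has_converged(log_lik, prev_log_lik, tol):
            return params, log_lik, n_iter
        # Track the previous log-likelihood for the next comparison.
        prev_log_lik = log_lik
    return params, log_lik, max_iter


def toy_step(x: float) -> Tuple[float, float]:
    """Toy stand-in for one EM update: move halfway toward the optimum at 1.0."""
    x_new = x + 0.5 * (1.0 - x)
    return x_new, -((1.0 - x_new) ** 2)


estimate, final_log_lik, n_iter = run_em(toy_step, 0.0)
```

In this toy run the loop stops well before max_iter because each update shrinks the log-likelihood change by a constant factor. A relative tolerance would be an equally plausible choice.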

Test Suite Reorganization

  • Consolidated GLM-HMM tests: Moved all GLM-HMM-specific tests to test_glm_hmm_algorithms.py (+2,421 lines)
  • Extracted common regressor tests: Created test_base_regressor_subclasses.py (+1,143 lines) to avoid duplication across GLM and GLM-HMM tests
  • Reduced test file sizes: Significantly streamlined test_glm.py (-1,296 lines net) by moving shared tests
  • Removed obsolete tests: Deleted old test_glm_hmm.py (-1,234 lines) in favor of the new comprehensive test suite
  • Improved test performance: Optimized test execution speed

Code Architecture Improvements

  • Generic base regressor: Made BaseRegressor generic with ParamsT type parameter for better type safety
  • Refactored parameter initialization: Split initialize_params() into public API and private _initialize_parameters() for clearer responsibilities
  • Reorganized utilities: Moved inverse_link_function_utils.py from glm/ to top-level src/nemos/ for better accessibility
  • Enhanced type casting: Added utilities in type_casting.py for consistent JAX array conversion
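
The generic base class and the public/private initialization split can be sketched as follows. This is an illustration only, assuming nothing about the actual BaseRegressor beyond what is listed above: BaseRegressorSketch, ToyGLM, the n_features argument, and the validation step are all hypothetical, and plain Python lists stand in for JAX arrays.

```python
from abc import ABC, abstractmethod
from typing import Generic, List, Tuple, TypeVar

# Type parameter describing the parameter container a given model uses.
ParamsT = TypeVar("ParamsT")


class BaseRegressorSketch(ABC, Generic[ParamsT]):
    """Hypothetical stand-in for a base regressor that is generic over ParamsT."""

    def initialize_params(self, n_features: int) -> ParamsT:
        """Public entry point: validate inputs, then delegate to the subclass."""
        if n_features < 1:
            raise ValueError("n_features must be a positive integer.")
        return self._initialize_parameters(n_features)

    @abstractmethod
    def _initialize_parameters(self, n_features: int) -> ParamsT:
        """Private hook: each model decides how its parameters are built."""


# A (coef, intercept) tuple; lists and floats are stand-ins for arrays.
GLMParams = Tuple[List[float], float]


class ToyGLM(BaseRegressorSketch[GLMParams]):
    """Toy subclass whose parameters are a (coef, intercept) tuple."""

    def _initialize_parameters(self, n_features: int) -> GLMParams:
        return [0.0] * n_features, 0.0


coef, intercept = ToyGLM().initialize_params(3)
```

With this layout a static type checker can infer that ToyGLM().initialize_params returns a GLMParams tuple, while shared validation stays in one place in the base class.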

New Features

  • GLM-HMM simulation script: Added scripts/generate_simulation_glm_hmm.py for generating synthetic GLM-HMM datasets with configurable neurons and states
  • Improved likelihood handling: Added prepare_likelihood_func() for consistent likelihood function preparation across population and single-neuron GLMs

Other Improvements

  • Updated regularizer handling and validation logic
  • Enhanced solver instantiation to accept loss functions directly
  • Various bug fixes and code quality improvements
  • Documentation updates

Testing

  • All existing tests pass with the refactored implementation
  • New comprehensive test suite provides better coverage of GLM-HMM functionality
  • Tests now run faster due to optimizations

Statistics

  • 29 files changed: 4,868 insertions(+), 2,715 deletions(-)
  • Net addition of ~2,150 lines despite significant refactoring

Breaking Changes

⚠️ Parameter structure change: GLM-HMM now uses glm_params: Tuple[Array, Array] instead of projection_weights: Array. Code using the GLM-HMM API will need to be updated accordingly.
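
As a rough illustration of the new structure, the sketch below unpacks a (coef, intercept) tuple and computes one linear predictor per state. It makes several assumptions that are not confirmed by this PR: coef is taken to have shape (n_features, n_states), intercept shape (n_states,), nested Python lists stand in for Array, and linear_predictor is a hypothetical helper rather than part of the nemos API.

```python
from typing import List, Tuple

Vector = List[float]        # stand-in for a 1D Array
Matrix = List[List[float]]  # stand-in for a 2D Array, assumed (n_features, n_states)
GLMParams = Tuple[Matrix, Vector]


def linear_predictor(x: Vector, glm_params: GLMParams) -> Vector:
    """Compute x @ coef + intercept for every state (hypothetical helper)."""
    # The tuple is unpacked explicitly instead of slicing one weights array.
    coef, intercept = glm_params
    n_states = len(intercept)
    return [
        sum(x[i] * coef[i][k] for i in range(len(x))) + intercept[k]
        for k in range(n_states)
    ]


# Two features, two states.
glm_params: GLMParams = (
    [[1.0, 0.0],
     [0.0, 2.0]],   # coef
    [0.5, -0.5],    # intercept
)
rates = linear_predictor([2.0, 3.0], glm_params)
```

Call sites that previously passed a single projection_weights array would need to build and pass such a tuple. How the old array maps onto coef and intercept is not stated here, so check the updated API before migrating.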

@codecov-commenter

codecov-commenter commented Nov 7, 2025

Codecov Report

❌ Patch coverage is 87.42138% with 20 lines in your changes missing coverage. Please review.
✅ Project coverage is 70.26%. Comparing base (dc72fcd) to head (ee2aac4).
⚠️ Report is 134 commits behind head on development.

Files with missing lines                         Patch %    Lines
src/nemos/base_regressor.py                      70.37%     8 Missing ⚠️
src/nemos/regularizer.py                         25.00%     6 Missing ⚠️
src/nemos/type_casting.py                        77.77%     2 Missing ⚠️
src/nemos/glm_hmm/expectation_maximization.py    97.56%     1 Missing ⚠️
src/nemos/inverse_link_function_utils.py         97.91%     1 Missing ⚠️
src/nemos/io/io.py                               0.00%      1 Missing ⚠️
src/nemos/validation.py                          75.00%     1 Missing ⚠️
Additional details and impacted files
@@               Coverage Diff               @@
##           development     #429      +/-   ##
===============================================
- Coverage        72.47%   70.26%   -2.21%     
===============================================
  Files               97      104       +7     
  Lines             9413     9901     +488     
===============================================
+ Hits              6822     6957     +135     
- Misses            2591     2944     +353     

