Releases: Alipsa/matrix
Matrix Datasets v2.2.0
- `Rdatasets.overview()` is now lazy (no network I/O on class loading); added `Rdatasets.refresh()` to clear the cache
- Add `Rdatasets.fetchData(String packageSlashItem)` single-argument overload (e.g. `fetchData('datasets/iris')`)
- Add `Rdatasets.search(String text)` to filter the overview by Item or Title (case-insensitive)
- Add `Dataset.names()`, `Dataset.mapNames()`, and `Dataset.load(String)` discoverability helpers
- Add `Dataset.mapRegions(String)` to list distinct region values for a map dataset
- Fix: `Dataset.iris()` no longer applies a phantom `Id: Integer` conversion (the column does not exist in the CSV)
- Fix: `FileUtil.getResourcePath()` now throws `FileNotFoundException` instead of `NullPointerException` when a resource is not found
- `Dataset.mapData()` now trims whitespace from the dataset name and includes valid names in the error message
- Upgrade dependencies: Groovy 5.0.5 → 5.0.6, jsoup 1.22.1 → 1.22.2
Matrix Spreadsheet v2.4.0
Refactoring, code quality, and usability improvements
Breaking Changes
- rename `ValueExtractor.getDouble()` to `getBigDecimal()` and update all subclasses (`FExcelValueExtractor`) accordingly
New Features
- add whole-sheet convenience imports with auto-detected dimensions: `SpreadsheetImporter.importSpreadsheet(file, sheetNumber)` and `importSpreadsheet(file, sheetName)`
- add `File`-accepting overloads to `SpreadsheetImporter` matching `SpreadsheetWriter`'s API shape
Bug Fixes
- remove dead I/O and wasted workbook allocation in `FExcelExporter.exportExcel` when target file already exists
- replace undeclared `commons-io` transitive dependency in `FOdsImporter` with Groovy's native `is.bytes`
- improve error messages when a named sheet does not exist (names the missing sheet instead of generic "No value present")
- standardise rounding mode in `TableUtil.round` to `HALF_EVEN` (banker's rounding) across all numeric column types
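
For readers unfamiliar with banker's rounding, the following is a minimal JDK-only sketch of how `HALF_EVEN` differs from `HALF_UP` on exact midpoints. It does not use `TableUtil`; the class and method names (`HalfEvenDemo`, `roundHalfEven`) are made up for illustration.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

/** Illustration only: compares HALF_EVEN (banker's rounding) with HALF_UP. */
class HalfEvenDemo {

    // Hypothetical helper: rounds a decimal literal to the given number of decimals with HALF_EVEN.
    static BigDecimal roundHalfEven(String value, int decimals) {
        return new BigDecimal(value).setScale(decimals, RoundingMode.HALF_EVEN);
    }

    public static void main(String[] args) {
        // Exact midpoints go to the nearest even digit under HALF_EVEN,
        // while HALF_UP always rounds them away from zero.
        for (String v : new String[] {"0.5", "1.5", "2.5", "2.51", "-2.5"}) {
            BigDecimal halfUp = new BigDecimal(v).setScale(0, RoundingMode.HALF_UP);
            System.out.println(v + " -> HALF_EVEN " + roundHalfEven(v, 0) + ", HALF_UP " + halfUp);
        }
    }
}
```

Only exact midpoints are affected: `2.5` rounds to `2` under `HALF_EVEN` but to `3` under `HALF_UP`, while `2.51` rounds to `3` in both modes.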
Code Quality
- add class-level GroovyDoc and fix `@param` mismatches in `SpreadsheetImporter`
- add GroovyDoc to `SpreadsheetWriter.writeSheets(Map)`
- fix stale/misleading GroovyDoc on `FExcelExporter.exportExcel` overloads
- make `SpreadsheetImporter.validateNotNull` private
- add `@CompileStatic` to `ValueExtractor`
- replace generic `Exception` throws in `FileUtil` with `SpreadsheetImportException`
- document why `FExcelExporter` cannot use `@CompileStatic` (fastexcel internal `GenericStyleSetter` access)
Test Coverage
- 135 tests passing (up from 105 in v2.3.0)
Matrix Tablesaw v0.3.0
Breaking Changes
- Removed previously deprecated `OdsReadOptions` factory methods
  - `OdsReadOptions.builder(Reader)` and `OdsReadOptions.builderFromString(String)` have been removed as promised in the v0.2.2 release notes.
- BigDecimalColumn arithmetic is now non-mutating by default.
  - `plus()`, `subtract()`, `multiply()`, and `divide()` return new columns instead of mutating the receiver.
  - Use the new `addTo()`, `subtractBy()`, `multiplyBy()`, and `divideBy()` methods for in-place mutation.
  - This makes Groovy operator overloading (`+`, `-`, `*`, `/`) behave intuitively.
New Features
- Friendlier Gtable factory APIs
  - `Gtable.create(Map)` infers column types from the first non-null value in each list.
  - `Gtable.create(Map, Map<String, ColumnType>)` allows named type overrides while inferring the rest.
  - All map-based factories now validate that every list has the same length and throw a clear `IllegalArgumentException` on mismatch.
- Table-level normalization convenience
  - `Gtable.normalizeMinMax(columnName, outputColumnName?, decimals?)`
  - `Gtable.normalizeMean(columnName, outputColumnName?, decimals?)`
  - `Gtable.normalizeStdScale(columnName, outputColumnName?, decimals?)`
  - `Gtable.normalizeLog(columnName, outputColumnName?, decimals?)`
  - Supports `DoubleColumn`, `FloatColumn`, and `BigDecimalColumn`.
  - Non-destructive by default (returns a new Gtable). Omit `outputColumnName` to replace the source column.
- Explicit unsupported-column handling in Matrix → Tablesaw conversion
  - `TableUtil.toTablesaw(Matrix)` now throws `IllegalArgumentException` for unsupported column types instead of silently skipping them.
  - `TableUtil.toTablesaw(Matrix, boolean skipUnsupported)` provides an explicit opt-in to skip unsupported columns.
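
The "first non-null value" inference and the equal-length check described for the map-based factories can be sketched in plain Java as below. This is not the `Gtable` implementation: `ColumnTypeInference`, `inferTypes`, and the `Object.class` fallback for all-null columns are assumptions made for illustration, and the sketch reports Java classes rather than Tablesaw `ColumnType` values.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Illustration only: infers a type per column from its first non-null value. */
class ColumnTypeInference {

    // Hypothetical helper: maps each column name to the class of its first non-null value.
    static Map<String, Class<?>> inferTypes(Map<String, List<?>> columns) {
        int expected = -1;
        Map<String, Class<?>> types = new LinkedHashMap<>();
        for (Map.Entry<String, List<?>> e : columns.entrySet()) {
            List<?> values = e.getValue();
            if (expected < 0) {
                expected = values.size();            // the first column sets the expected length
            } else if (values.size() != expected) {
                throw new IllegalArgumentException("Column '" + e.getKey() + "' has "
                    + values.size() + " values, expected " + expected);
            }
            Class<?> type = Object.class;            // assumption: fallback for all-null columns
            for (Object v : values) {
                if (v != null) {
                    type = v.getClass();             // first non-null value decides the type
                    break;
                }
            }
            types.put(e.getKey(), type);
        }
        return types;
    }

    public static void main(String[] args) {
        Map<String, List<?>> data = new LinkedHashMap<>();
        data.put("id", Arrays.asList(null, 2, 3));
        data.put("name", Arrays.asList("a", "b", null));
        System.out.println(inferTypes(data));
    }
}
```

Failing fast on a length mismatch, as the release notes describe, means a ragged input is reported with the offending column name instead of producing a misaligned table.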
Bug Fixes
- Preserve Matrix type metadata during Tablesaw conversion
  - `TableUtil.classForColumnType` now compares against `ColumnType` constants and `BigDecimalColumnType.instance()` directly, fixing cases where type metadata was lost.
- Fix ODS missing cell handling
  - Null cells in ODS spreadsheets are now imported as missing values instead of the literal string `"null"`.
- Fix XLSX DateTime export
  - `LOCAL_DATE_TIME` columns now preserve both date and time components when written to XLSX.
- Gtable.copy() now deep-copies columns
  - Previously `copy()` reused the original column objects, allowing mutations to leak back to the source table. It now creates independent column copies.
- XmlReader now throws RuntimeIOException on parse failures
  - `DocumentException` from dom4j was previously wrapped in a raw `RuntimeException`; it is now consistently wrapped in `RuntimeIOException`.
- BigDecimalAggregateFunctions.cv guards against zero mean
  - Dividing by a zero mean now throws a clear `IllegalArgumentException` instead of an opaque `ArithmeticException`.
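
To illustrate the zero-mean guard, here is a minimal JDK-only sketch of a coefficient of variation (standard deviation divided by mean) over `BigDecimal` values. It is not the `BigDecimalAggregateFunctions.cv` code: the class name `CvDemo`, the use of the sample standard deviation, and `MathContext.DECIMAL64` precision are all assumptions. `BigDecimal.sqrt` requires Java 9 or later.

```java
import java.math.BigDecimal;
import java.math.MathContext;
import java.util.List;

/** Illustration only: coefficient of variation with an explicit zero-mean guard. */
class CvDemo {

    static BigDecimal cv(List<BigDecimal> values) {
        if (values.size() < 2) {
            throw new IllegalArgumentException("cv needs at least two values");
        }
        MathContext mc = MathContext.DECIMAL64;      // assumption: 16 significant digits
        BigDecimal sum = BigDecimal.ZERO;
        for (BigDecimal v : values) {
            sum = sum.add(v);
        }
        BigDecimal mean = sum.divide(BigDecimal.valueOf(values.size()), mc);
        if (mean.signum() == 0) {
            // Without this guard the final divide would fail with an ArithmeticException.
            throw new IllegalArgumentException("cv is undefined when the mean is zero");
        }
        BigDecimal squares = BigDecimal.ZERO;
        for (BigDecimal v : values) {
            BigDecimal d = v.subtract(mean);
            squares = squares.add(d.multiply(d));
        }
        // Assumption: sample variance (n - 1 in the denominator).
        BigDecimal variance = squares.divide(BigDecimal.valueOf(values.size() - 1), mc);
        return variance.sqrt(mc).divide(mean, mc);
    }

    public static void main(String[] args) {
        System.out.println(cv(List.of(new BigDecimal("2"), new BigDecimal("4"), new BigDecimal("9"))));
    }
}
```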
Documentation
- Updated `readme.md` with current dependency guidance (use `matrix-bom` or `matrix-all`), quick examples for conversion, Gtable factories, BigDecimal arithmetic, and normalization.
- Fixed incorrect BOM version references in `readme.md` (was `3.7.0`, corrected to `2.5.0`).
- Fixed GroovyDoc typos (`extansion` → `extension`, `tgble` → `Gtable`) and added missing method documentation to public API surface in `Gtable.groovy`.
- Added missing Javadoc to `BigDecimalColumn.add(BigDecimalColumn)`.
Code Quality
- `BigDecimalColumnType.INSTANCE` is now `final`.
- Extracted `assertSameSize(BigDecimalColumn)` to eliminate duplicated size-check logic across `BigDecimalColumn` arithmetic methods.
Build & Publishing
- Corrected POM `url` and added a module-local `LICENSE` file (Apache License 2.0).
- Updated license metadata in published POM from MIT to Apache 2.0 to align with Tablesaw licensing.
Dependency Updates
- com.github.miachm.sods:SODS 1.8.2 -> 1.8.3
matrix-gsheets-0.2.0
Breaking Changes
- Renamed authentication classes: `BqAuthenticator` → `GsAuthenticator`, `BqAuthUtils` → `GsAuthUtils`
  - See `docs/0.1-0.2-MIGRATION.md` for detailed migration guide with before/after examples
New Features
- `GsheetsWriter.update(String, String, Matrix, ...)`: Write Matrix data to an existing spreadsheet and range
- `GsheetsWriter.spreadsheetUrl(String)`: Convenience helper that returns the Google Sheets edit URL for a spreadsheet ID
Improvements
- Eliminated duplicated service-setup code: Extracted `buildSheetsService()` helpers in both `GsheetsReader` and `GsheetsWriter`
- Eliminated duplicated utilities in `GsheetsWriter`: Removed `sanitizeSheetName()`, `toCell()`, and `MAX_SHEET_NAME_LENGTH`; the writer now delegates to `GsUtil`
- Fixed sheet-name extraction: When a range has no `!` prefix (e.g., `A1:D10`), the matrix name defaults to `''` instead of the raw range string
- Narrowed exception handling: `GsUtil.getSheetNames(String, Sheets)` now catches `IOException` specifically and declares `throws IOException`
- Improved numeric precision: `GsConverter.asLocalDate(Number)` uses `longValue()` instead of `intValue()` for day counts
- Idiomatic Groovy cleanup: Removed ~20 unnecessary `return` keywords from final expressions; replaced Java-style `new ArrayList<>()` + `.add()` with `[]` literals and `<<`/`collect`
- Code quality: Removed commented-out tokeninfo URL from `GsAuthUtils`; converted inline usage docs to proper GroovyDoc
- Fixed `GsAuthenticator` log indentation: `log.info` call is now correctly inside the `if (verbose)` block
- Fixed static field ordering: Moved `PROP_USER_HOME` above `ADC_FILE_PATH` to eliminate initialization-order hazard
- Added missing return type declarations: `GsConverter.asLocalTime(Object)` → `LocalTime`, `GsConverter.asSerial(Date)` → `BigDecimal`
- Added historical-naming GroovyDoc notes to `GsAuthenticator` and `GsAuthUtils` explaining the `Bq` prefix
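
The `longValue()` change for day counts can be pictured with the small sketch below. It is not the `GsConverter` code: `SerialDateDemo` is a made-up class, and the epoch of 1899-12-30 for serial day 0 is an assumption based on how Google Sheets represents dates as serial numbers, not something stated in these notes.

```java
import java.time.LocalDate;

/** Illustration only: converts a spreadsheet serial day count to a LocalDate. */
class SerialDateDemo {

    // Assumption: serial day 0 corresponds to 1899-12-30.
    static final LocalDate EPOCH = LocalDate.of(1899, 12, 30);

    // longValue() drops any time-of-day fraction and does not wrap at the int range.
    static LocalDate asLocalDate(Number serial) {
        return EPOCH.plusDays(serial.longValue());
    }

    public static void main(String[] args) {
        System.out.println(asLocalDate(44927));      // whole-day serial
        System.out.println(asLocalDate(44927.75));   // fraction (time of day) is ignored

        // Why long matters: narrowing to int wraps for very large day counts.
        Number huge = 3_000_000_000L;
        System.out.println(huge.longValue() + " vs " + huge.intValue());
    }
}
```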
Test Changes
- Added 5 positive-case tests for `GsConverter.toSerials` (LocalDates, LocalDateTimes, LocalTimes, mixed types, empty list)
- Added 5 validation tests for `GsheetsWriter.update()` (null spreadsheetId, null range, null matrix, empty matrix, no rows)
- Moved 16 tests (`sanitizeSheetName` and `toCell` coverage) from `GsExporterTest` to `GsUtilTest` where they belong
- Removed brittle reflection-based tests from `GsheetsReaderTest` and `GsheetsWriterTest`
Dependency Updates
- com.google.apis:google-api-services-drive v3-rev20260220-2.0.0 → v3-rev20260322-2.0.0
- com.google.apis:google-api-services-sheets v4-rev20251110-2.0.0 → v4-rev20260213-2.0.0
- org.mockito:mockito-core 5.22.0 → 5.23.0
- org.mockito:mockito-junit-jupiter 5.22.0 → 5.23.0
- se.alipsa.nexus-release-plugin:se.alipsa.nexus-release-plugin.gradle.plugin 2.1.1 → 2.1.2
Matrix Avro 0.3.0
- matrix-avro now compiles Groovy statically by default via config/groovy/compileStatic.groovy; explicit @CompileStatic annotations were removed.
- CodeNarc is now enforced for the module with ignoreFailures = false, and warnings were fixed.
- Schema inference was hardened/refactored through ColumnProfile, especially for mutable/object-typed columns, decimal metadata, lists, maps, and record-like maps.
- fixed pre-epoch `local-timestamp-millis` reads by using floor modulo for nanosecond remainders
- made explicit `timestamp-millis` writes of `LocalDateTime` timezone-stable by interpreting them at UTC
- aligned stream ownership with the public docs
  - `InputStream` read/schema overloads leave caller-owned streams open
  - `OutputStream` write overloads leave caller-owned streams open
- removed the writer schema cache so mutated matrices cannot reuse stale schemas
- added public schema inspection APIs through `MatrixAvroReader.schema(...)`
- added convenience factories and shortcuts
  - `AvroReadOptions.defaults()` and `AvroReadOptions.named(...)`
  - `AvroWriteOptions.defaults()` and `AvroWriteOptions.exactDecimals()`
  - `MatrixAvroWriter.writeExactDecimals(...)` and `writeExactDecimalBytes(...)`
  - `AvroSchemaDecl.decimalColumn(...)`, `arrayOf(...)`, and `mapOf(...)`
- tightened public validation for schema building and null options
- refreshed README, tutorial, and cookbook examples for schema inspection and decimal-safe writes
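
The pre-epoch fix mentioned above rests on the difference between Java's truncating `%` and floor modulo. The sketch below shows the idea with JDK classes only; `PreEpochMillis` and `fromLocalTimestampMillis` are illustrative names, not the matrix-avro reader code.

```java
import java.time.LocalDateTime;
import java.time.ZoneOffset;

/** Illustration only: converts millis since 1970-01-01T00:00 (local) to LocalDateTime. */
class PreEpochMillis {

    static LocalDateTime fromLocalTimestampMillis(long millis) {
        // floorDiv/floorMod keep the remainder in 0..999 even for negative input,
        // so the nanosecond part stays valid for pre-epoch values.
        long seconds = Math.floorDiv(millis, 1000L);
        int nanos = (int) Math.floorMod(millis, 1000L) * 1_000_000;
        return LocalDateTime.ofEpochSecond(seconds, nanos, ZoneOffset.UTC);
    }

    public static void main(String[] args) {
        // Truncating remainder is negative for pre-epoch input: -1 % 1000 == -1,
        // which is not a valid nanosecond-of-second once scaled.
        System.out.println(-1L % 1000L);
        System.out.println(Math.floorMod(-1L, 1000L));      // 999
        System.out.println(fromLocalTimestampMillis(-1L));  // 1969-12-31T23:59:59.999
    }
}
```

Using `ZoneOffset.UTC` here only anchors the arithmetic; the result is a zone-less `LocalDateTime`, which matches the "local" semantics of the logical type.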
Matrix SQL v2.4.0
SQL identifier handling
- New `SqlIdentifier` utility class for safe quoting and rendering of table/column names containing spaces, mixed case, reserved words, punctuation, or embedded quotes.
- All generated SQL (DDL, INSERT, UPDATE, DROP) now routes through `SqlIdentifier`, replacing ad-hoc string concatenation with proper double-quote escaping.
- `MatrixDbUtil.tableName()` sanitises matrix names with a regex-based normaliser instead of targeted character replacements.
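
As a rough picture of what "double-quote escaping" means for SQL identifiers, the sketch below wraps a name in double quotes and doubles any embedded quote, which is the standard SQL rule for delimited identifiers. It is not the `SqlIdentifier` API; `SqlQuoteDemo.quote` is a hypothetical helper.

```java
/** Illustration only: standard SQL delimited-identifier quoting. */
class SqlQuoteDemo {

    // Hypothetical helper: wraps the name in double quotes and doubles embedded quotes.
    static String quote(String identifier) {
        if (identifier == null || identifier.isEmpty()) {
            throw new IllegalArgumentException("identifier must not be null or empty");
        }
        return "\"" + identifier.replace("\"", "\"\"") + "\"";
    }

    public static void main(String[] args) {
        System.out.println("SELECT " + quote("order date") + " FROM " + quote("My \"odd\" table"));
    }
}
```

Quoting applies to identifiers only; values should still be passed as prepared-statement parameters rather than concatenated into the SQL text.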
Prepared-statement convenience APIs
- `select(String, List, String)`: parameterised SELECT returning a Matrix.
- `update(String, List)`: parameterised UPDATE/INSERT/DELETE.
- `delete(String, List)`: parameterised DELETE.
- `execute(String, List)`: parameterised arbitrary SQL returning a `Map<Integer, Object>` of result sets and update counts.
- `insert(String, Matrix)`: insert a Matrix into an explicitly named table.
- `update(String, Row)`: overload that always throws `IllegalArgumentException` to prevent accidental unconstrained updates; use `update(String, Row, String...)` with match columns instead.
Managed (non-owning) connections
- New constructors `MatrixSql(Connection, DataBaseProvider)` and `MatrixSql(Connection, SqlTypeMapper)` wrap an externally supplied connection.
- `close()` leaves externally supplied connections open and usable.
Factory offline fallback
- `MatrixSqlFactory.FALLBACK_VERSIONS` map centralises pinned fallback versions for H2 and Derby.
- `createH2()`, `createDerby()`, and generic `create()` fall back to the pinned version when Maven Central is unreachable instead of throwing.
- Error message for unsupported providers now includes the dependency coordinates and a "no fallback version is configured" hint.
MatrixResultSet hardening
- Guards (`ensureOpen`, `ensureCurrentRow`, `checkedColumnIndex`) prevent operations on closed, unpositioned, or out-of-range result sets.
- All column accessors routed through `readValue()` helper for consistency.
- `updateRow()` is a documented no-op for detached result sets.
- Null-safe primitive getters return JDBC-specified defaults (0, false) when the value is null.
- `unwrap()` and `isWrapperFor()` follow the strict JDBC contract.
- Calendar-aware `getDate`, `getTime`, `getTimestamp` and `getURL` corrected.
MatrixDbUtil improvements
- Default minimum sizes for VARCHAR (255), DECIMAL precision (38) and scale (10) when column scanning finds no data.
- `insert(Connection, String, Matrix, boolean)` overload passes `addQuotes` through to generated SQL.
Build and dependency changes
- Remove `dependency-resolver` dependency; replaced by `maven-utils` which now covers the same functionality.
- Migrate inline dependency declarations to Gradle version catalog (`libs.versions.toml`).
- Add `groovier-junit` test dependency for Groovy-friendly JUnit assertions.
- Remove `log4j-to-slf4j` test runtime dependency.
- Dependency upgrades:
  - se.alipsa.groovy:data-utils [2.0.4 -> 2.0.6]
  - se.alipsa:maven-utils (replaces maven-3.9.11-utils 1.1.0) [-> 1.4.1]
  - commons-io:commons-io [2.21.0 -> 2.22.0]
Documentation and test hygiene
- Comprehensive GroovyDoc on all public methods and constructors.
- Expanded README with runnable examples for all public workflows.
- Replaced all `println`/`System.err.println` in tests with assertions or Logger.
Matrix Arff 0.2.1
- Fix nominal sentinel values (`?`, empty string, `%`-prefixed) being written unquoted, causing lossy ARFF round-trips
- Add `ArffDateFormats` utility to share strict (`lenient=false`), UTC, `Locale.ROOT` date formatter creation between reader and writer
- Fix date parsing to reject invalid dates (e.g. `2026-02-31`) instead of silently normalizing them
- Fix published SCM URL in POM (`matrix-arff/tree/master` → `tree/main/matrix-arff`)
- Apply module-wide `@CompileStatic` via `compileStatic.groovy` build configuration
- Add regression tests for nominal sentinel round-trips, invalid date rejection, and UTC date output
- Fixed all CodeNarc warnings and changed the build to fail on any new warnings.
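
The strict date handling described above (non-lenient, UTC, `Locale.ROOT`) can be sketched with `java.text.SimpleDateFormat` as follows. Whether `ArffDateFormats` is built on `SimpleDateFormat` is an assumption; `StrictDateDemo`, `strictUtc`, and `isValid` are illustrative names only.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Locale;
import java.util.TimeZone;

/** Illustration only: strict, UTC, locale-neutral date formatter. */
class StrictDateDemo {

    // Hypothetical factory: a fresh formatter per call, since SimpleDateFormat is not thread-safe.
    static SimpleDateFormat strictUtc(String pattern) {
        SimpleDateFormat format = new SimpleDateFormat(pattern, Locale.ROOT);
        format.setLenient(false);                          // reject impossible dates
        format.setTimeZone(TimeZone.getTimeZone("UTC"));   // output independent of the JVM default zone
        return format;
    }

    static boolean isValid(String pattern, String text) {
        try {
            strictUtc(pattern).parse(text);
            return true;
        } catch (ParseException e) {
            return false;
        }
    }

    public static void main(String[] args) throws ParseException {
        System.out.println(isValid("yyyy-MM-dd", "2026-02-28"));
        System.out.println(isValid("yyyy-MM-dd", "2026-02-31"));
        // A default (lenient) formatter would instead roll the impossible date forward into March.
        System.out.println(new SimpleDateFormat("yyyy-MM-dd", Locale.ROOT).parse("2026-02-31"));
    }
}
```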
Matrix Parquet v0.5.0
- Add SPI integration: `MatrixParquetFormatProvider` registers `.parquet` extension with Matrix SPI so `Matrix.read(file)` and `matrix.write(file)` work without explicit imports
- Add `ParquetReadOptions` and `ParquetWriteOptions` typed options classes with `describe()` and `fromMap()` for runtime discovery and SPI use
- Add builder API: `MatrixParquetReader.builder()` and `MatrixParquetWriter.builder(matrix)` as the recommended fluent interface
- Add `write(OutputStream)` and `write(Path)` overloads to writer; add `read(byte[])`, `read(URL)`, `read(Path)`, `read(InputStream)` to builder
- Add `writeBytes()` method to write to a byte array without a file
- Strengthen `ParquetWriteOptions.validate()`: enforce `precision > 0`, `scale >= 0`, `scale ≤ precision` for both uniform and per-column decimal meta
- Add `decimalMeta` per-column precision/scale validation in `validateDecimalMeta` (checks shape, range, and consistency)
- Fix bug: resource leak in `MatrixParquetReader`; reader now wrapped in `withCloseable` to ensure streams are always closed
- Fix bug: negative `BigDecimal` values padded with `0x00` instead of `0xFF` causing sign bit corruption on read
- Fix bug: timestamp type mapped to `MILLIS` instead of `MICROS` causing precision loss
- Fix bug: deprecated `BigDecimal.ROUND_HALF_UP` replaced with `RoundingMode.HALF_UP`
- Fix `hasUniformPrecisionAndScale()` semantics: now returns true only when both `precision` and `scale` are non-null
- Enable `@CompileStatic` by default for all production sources
- Enable CodeNarc with `ignoreFailures=false`; fix all pre-existing violations
- Remove ivy dependency (was unused)
- Make internal APIs private; simplify `readFromInputStream`
- Migrate tests to `@TempDir`; remove manual temp-file cleanup boilerplate
- Correct POM publication URLs: `url`, `license.url`, and `scm.url` now use `tree/main` paths (Maven Central convention)
- Add README sections: "At a Glance" goal/entry-point table, "API Surface" reference, "Default Behavior" for naming and decimal defaults
- Upgrade dependencies
  - org.apache.hadoop:hadoop-common [3.4.2 -> 3.4.3]
  - org.apache.hadoop:hadoop-mapreduce-client-core [3.4.2 -> 3.4.3]
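
The negative-`BigDecimal` padding bug listed above comes down to sign extension of a big-endian two's-complement value. The sketch below reproduces the idea with JDK classes only; `DecimalPadding` and its methods are illustrative and are not the matrix-parquet writer code.

```java
import java.math.BigDecimal;
import java.math.BigInteger;
import java.util.Arrays;

/** Illustration only: fixed-length two's-complement encoding of a decimal's unscaled value. */
class DecimalPadding {

    static byte[] toFixedLength(BigDecimal value, int length) {
        byte[] raw = value.unscaledValue().toByteArray();   // minimal big-endian two's complement
        if (raw.length > length) {
            throw new IllegalArgumentException("value does not fit in " + length + " bytes");
        }
        byte[] out = new byte[length];
        // Sign extension: negative values must be padded with 0xFF, non-negative with 0x00.
        byte pad = (byte) (value.signum() < 0 ? 0xFF : 0x00);
        Arrays.fill(out, 0, length - raw.length, pad);
        System.arraycopy(raw, 0, out, length - raw.length, raw.length);
        return out;
    }

    static BigDecimal fromFixedLength(byte[] bytes, int scale) {
        return new BigDecimal(new BigInteger(bytes), scale);
    }

    public static void main(String[] args) {
        BigDecimal value = new BigDecimal("-12.34");
        System.out.println(fromFixedLength(toFixedLength(value, 8), 2));   // -12.34

        // Zero padding loses the sign: the same low-order bytes read back as a positive number.
        byte[] raw = value.unscaledValue().toByteArray();
        byte[] zeroPadded = new byte[8];
        System.arraycopy(raw, 0, zeroPadded, 8 - raw.length, raw.length);
        System.out.println(fromFixedLength(zeroPadded, 2));                // 643.02
    }
}
```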
Matrix Json v2.2.0
- add fluent `WriteBuilder` API for `JsonWriter`: `JsonWriter.write(matrix).indent().to(file)`
- add matrix name derivation from file/URL in `JsonReader`
- add `matrixName` option to `JsonReadOptions` to override file-derived name
- add type-conversion support (`types`, `dateTimeFormat`) to `JsonReadOptions`/`JsonFormatProvider`
- add `readString(String)` convenience method to `JsonReader`
- add missing `Path` and `String` write overloads for the formatter variant of `JsonWriter`
- switch `JsonWriter` from Groovy's `groovy.json.*` to Jackson streaming API (O(columns) peak memory)
- remove `groovy-json` build dependency (Jackson used for both reading and writing)
- enable build-level `@CompileStatic` for all production code
- add module-level CodeNarc configuration with `ignoreFailures = false`
- fix NPE risk in `writeString` when `columnFormatters` is null
- fix resource leak in `JsonReader.read(File, Charset)` (nested `withCloseable`)
- fix missing directory/parent-dir guards in formatter `write(File, ...)` overload
- deprecate all existing static methods on `JsonWriter` in favor of fluent API
- change JSON float parsing to deserialize floating-point numbers as `BigDecimal` instead of `Double` so imported decimal values keep their exact textual precision
- upgrade dependencies
  - com.fasterxml.jackson.core:jackson-core 2.21.0 -> 2.21.2
  - com.fasterxml.jackson.core:jackson-databind 2.21.0 -> 2.21.2
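
Why parsing JSON floats as `BigDecimal` keeps "exact textual precision" can be seen without any JSON library. The sketch below parses the same number token both ways using only the JDK; `JsonNumberDemo` and its helper names are made up for illustration and say nothing about how `JsonReader` is wired internally.

```java
import java.math.BigDecimal;

/** Illustration only: the same numeric token read as Double versus BigDecimal. */
class JsonNumberDemo {

    // Text form after a round trip through a binary double.
    static String viaDouble(String token) {
        return String.valueOf(Double.parseDouble(token));
    }

    // Text form after a round trip through BigDecimal (digits and scale are preserved).
    static String viaBigDecimal(String token) {
        return new BigDecimal(token).toPlainString();
    }

    public static void main(String[] args) {
        for (String token : new String[] {"1.10", "0.12345678901234567890123"}) {
            System.out.println(token + " -> Double " + viaDouble(token)
                + ", BigDecimal " + viaBigDecimal(token));
        }
    }
}
```

The trade-off is that code which previously received `Double` values from JSON imports may now see `BigDecimal`, so strict type checks or primitive arithmetic downstream may need adjusting.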
Matrix Stats v2.4.0
Native runtime cleanup and idiomatic Groovy API expansion
Runtime and Dependency Changes
- Remove Apache Commons Math from the runtime dependency surface. `commons-math3` is now test-only and used to validate native implementations.
- Add EJML Simple as an internal `implementation` dependency for the public linear algebra facade.
- Configure `matrix-stats` production Groovy compilation through the shared compile-static Gradle script.
- Make CodeNarc fail the build instead of reporting non-blocking findings.
New Public APIs
Linear Algebra
- Add `se.alipsa.matrix.stats.linalg.Linalg` for dense matrix operations: inverse, determinant, linear solves, real eigenvalues, and SVD.
- Add Matrix and Grid adapters with Groovy-facing return types: `Matrix`, `Grid<BigDecimal>`, `BigDecimal`, `List<BigDecimal>`, and `SvdResult`.
- Add `LinalgSingularMatrixException` and shared adapter utilities for singular and invalid matrix input handling.
- Add compatibility classes under `se.alipsa.matrix.stats.linear` for native matrix algebra support.
Linear Interpolation
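- Add `se.alipsa.matrix.stats.interpolation.Interpolation` for public linear interpolation.
- Support explicit `(x, y, targetX)` interpolation, evenly spaced series interpolation, and Matrix/Grid-backed column interpolation.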
(x, y, targetX)interpolation, evenly spaced series interpolation, and Matrix/Grid-backed column interpolation. - Reject extrapolation, unsorted domains, duplicate domain values, length mismatches, ragged grids, and non-numeric columns.
- Keep spline logic internal to formula/GAM smooth-term support; public spline interpolation is not part of the 2.4.0 API.
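
The explicit `(x, y, targetX)` form and the rejection rules listed above can be sketched as follows. This is a plain-Java illustration on `double` arrays, not the `Interpolation` class (which, per the notes elsewhere in this release, favours `Number` inputs and `BigDecimal` results); `LinearInterpolationDemo.interpolate` is a hypothetical name.

```java
/** Illustration only: piecewise-linear interpolation with strict input validation. */
class LinearInterpolationDemo {

    static double interpolate(double[] x, double[] y, double targetX) {
        if (x.length != y.length) {
            throw new IllegalArgumentException("x and y must have the same length");
        }
        if (x.length < 2) {
            throw new IllegalArgumentException("at least two points are required");
        }
        for (int i = 1; i < x.length; i++) {
            if (x[i] <= x[i - 1]) {
                throw new IllegalArgumentException("x must be strictly increasing (no unsorted or duplicate values)");
            }
        }
        if (Double.isNaN(targetX) || targetX < x[0] || targetX > x[x.length - 1]) {
            throw new IllegalArgumentException("targetX is outside the domain; extrapolation is not supported");
        }
        for (int i = 1; i < x.length; i++) {
            if (targetX <= x[i]) {
                // Fraction of the way from x[i-1] to x[i], applied to the y interval.
                double t = (targetX - x[i - 1]) / (x[i] - x[i - 1]);
                return y[i - 1] + t * (y[i] - y[i - 1]);
            }
        }
        throw new IllegalStateException("unreachable: targetX was validated against the domain");
    }

    public static void main(String[] args) {
        double[] x = {1, 2, 4};
        double[] y = {10, 20, 40};
        System.out.println(interpolate(x, y, 3));   // 30.0
    }
}
```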
Formula and Model-Frame Pipeline
- Add R-style formula parsing and normalization support, including additive terms, intercept control, interactions, shorthand expansion, quoted identifiers, numeric transformations, `poly(...)`, and `s(...)` smooth terms.
- Add `ModelFrame` and `ModelFrameResult` for design-matrix construction from Matrix data.
- Add a Groovy-native operator DSL for formulas using `|`, `noIntercept`, `interaction(...)`, `smooth(...)`, and `I { ... }`.
- Add categorical treatment encoding, dot expansion, subset support, NA handling, weights, offsets, and external environment variable resolution.
- Add `NaAction` and formula metadata classes to make downstream model fitting explicit.
- Reject unsupported response forms, smooth-term interactions, and unsupported frame metadata instead of silently ignoring them.
Fit Registry and Formula-Based Regression
- Add `FitRegistry`, `FitMethod`, `FitOptions`, and `FitResult` for named fit-method dispatch.
- Add built-in `lm`, `loess`, and `gam` fit methods.
- Add `FitDsl` convenience entry points so Groovy callers can write `lm(data) { y | x + group }`, `loess(data) { y | x }`, and `gam(data) { y | smooth(time, 6) + group }`.
- Add `MultipleLinearRegression`, `LmMethod`, `LoessMethod`, and `GamMethod`.
- Add `LoessOptions` and `GamOptions` with Groovy-facing numeric option surfaces.
Native Distributions
- Add native `NormalDistribution`, `ChiSquaredDistribution`, and `HypergeometricDistribution`.
- Extend `TDistribution`, `FDistribution`, and `SpecialFunctions` to remove Commons Math runtime usage.
- Add Groovy-facing `Number` overloads returning `BigDecimal` for public scalar distribution APIs.
- Add typed entry points for F-distribution ANOVA helpers to avoid unsafe generic overload dispatch.
Native Solvers and Optimization
- Replace Commons Math optimizer/solver runtime usage with native implementations.
- Add `BrentSolver` for one-dimensional bracketing root finding.
- Update `GoalSeek` to use the native Brent-Dekker solver.
- Add `NelderMeadOptimizer` for derivative-free multivariate minimization.
- Add `LinearProgramSolver` for equality-form linear programs with non-negative variables.
- Add `UnivariateObjective` and `MultivariateObjective` interfaces with Groovy-facing numeric bridges.
Idiomatic Groovy Numeric API Cleanup
- Move public scalar result values toward `BigDecimal` and public numeric inputs toward `Number`.
- Add `NumericConversion` to centralize finite-value checks, BigDecimal conversion, array/list conversion, alpha validation, and exact integer validation.
- Add `StatUtils` and `LeastSquaresKernel` for small internal double-precision kernels where performance or algorithm constraints justify primitive arrays.
- Replace duplicate numeric coercion and conversion logic across regression, interpolation, distributions, time-series, KDE, and solver code.
- Add typed compatibility entry points for `Johansen` and F-distribution list/array APIs where JVM erasure makes same-name generic overloads unsafe.
- Deprecate primitive/double-style identity accessors in selected result classes where Groovy-facing properties are now preferred.
Time-Series and Statistical Robustness
- Extract shared time-series utility logic into `TimeSeriesUtils`.
- Improve singular-matrix handling and error messages in time-series code.
- Harden Johansen critical-value handling for unsupported variable counts.
- Improve CCM library-size validation by rejecting non-integral numeric inputs instead of truncating silently.
- Update ANOVA and contingency result APIs to accept `Number` alpha inputs.
Tests and Quality
- Add coverage for formula parsing, model-frame construction, design matrices, spline basis expansion, fit registry, `lm`, `loess`, `gam`, multiple linear regression, interpolation, linalg, SVD, native distributions, native solvers, numeric conversion, and least-squares kernels.
- Add direct unit tests for `GroupEstimator.estimateNumberOfGroups` and `estimateKByElbow`, covering both the `double[][]` and Groovy-facing `List` overloads, custom `maxK`/`iterations`, and error cases (too few points, too few distinct points).
- Add benchmark-oriented tests for selected Groovy-facing paths versus retained primitive kernels.
- Update tests for BigDecimal/Groovy-friendly assertions and numeric API behavior.
- Increase coverage around null handling, weights, offsets, subset filtering, categorical encoding, exact integer validation, singular matrices, and solver convergence.
Documentation
- Refresh `matrix-stats` README, tutorial, and cookbook docs for the 2.4.0 API surface.
- Document the Commons Math runtime removal and EJML implementation detail.
- Add examples for `Linalg`, `Interpolation`, formula/model-frame fitting, the Groovy formula DSL, fit convenience helpers, native distributions, and native solvers.
- Refresh broader tutorial setup snippets to use the current BOM and Groovy versions.