Releases: Alipsa/matrix

Matrix Datasets v2.2.0

10 May 18:21

  • Rdatasets.overview() is now lazy — no network I/O on class loading; added Rdatasets.refresh() to clear the cache
  • Add Rdatasets.fetchData(String packageSlashItem) single-argument overload (e.g. fetchData('datasets/iris'))
  • Add Rdatasets.search(String text) to filter the overview by Item or Title (case-insensitive)
  • Add Dataset.names(), Dataset.mapNames(), and Dataset.load(String) discoverability helpers
  • Add Dataset.mapRegions(String) to list distinct region values for a map dataset
  • Fix: Dataset.iris() no longer applies a phantom Id: Integer conversion (the column does not exist in the CSV)
  • Fix: FileUtil.getResourcePath() now throws FileNotFoundException instead of NullPointerException when a resource is not found
  • Dataset.mapData() now trims whitespace from the dataset name and includes valid names in the error message
  • Upgrade dependencies: Groovy 5.0.5 → 5.0.6, jsoup 1.22.1 → 1.22.2
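
The FileUtil.getResourcePath() fix above replaces an accidental NullPointerException with an explicit FileNotFoundException. A minimal Java sketch of that pattern follows; the class and method names are hypothetical and this is not the library's code.

```java
import java.io.FileNotFoundException;
import java.net.URL;

final class ResourceLookupDemo {
    // Hypothetical helper: resolves a classpath resource or fails with a descriptive checked exception.
    static URL requireResource(String name) throws FileNotFoundException {
        URL url = ResourceLookupDemo.class.getClassLoader().getResource(name);
        if (url == null) {
            // Without this check, a later url.toURI() or url.getPath() call would throw NullPointerException.
            throw new FileNotFoundException("Resource not found on classpath: " + name);
        }
        return url;
    }

    public static void main(String[] args) {
        try {
            requireResource("no/such/resource.csv");
        } catch (FileNotFoundException e) {
            System.out.println(e.getMessage());
        }
    }
}
```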

Matrix Spreadsheet v2.4.0

08 May 14:26

Refactoring, code quality, and usability improvements

Breaking Changes

  • rename ValueExtractor.getDouble() to getBigDecimal() and update all subclasses (FExcelValueExtractor) accordingly

New Features

  • add whole-sheet convenience imports with auto-detected dimensions: SpreadsheetImporter.importSpreadsheet(file, sheetNumber) and importSpreadsheet(file, sheetName)
  • add File-accepting overloads to SpreadsheetImporter matching SpreadsheetWriter's API shape

Bug Fixes

  • remove dead I/O and wasted workbook allocation in FExcelExporter.exportExcel when target file already exists
  • replace undeclared commons-io transitive dependency in FOdsImporter with Groovy's native is.bytes
  • improve error messages when a named sheet does not exist (names the missing sheet instead of generic "No value present")
  • standardise rounding mode in TableUtil.round to HALF_EVEN (banker's rounding) across all numeric column types
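
For readers unfamiliar with HALF_EVEN, the sketch below shows how banker's rounding treats exact ties using the JDK's BigDecimal. It illustrates the rounding mode only; the helper name is an assumption and TableUtil.round itself is not reproduced here.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

final class HalfEvenDemo {
    // Rounds a decimal string to the given number of decimals using banker's rounding.
    static BigDecimal round(String value, int decimals) {
        return new BigDecimal(value).setScale(decimals, RoundingMode.HALF_EVEN);
    }

    public static void main(String[] args) {
        // Exact ties go to the nearest even digit, so rounding errors do not drift upward.
        System.out.println(round("0.125", 2)); // 0.12
        System.out.println(round("0.135", 2)); // 0.14
        System.out.println(round("2.5", 0));   // 2
        System.out.println(round("3.5", 0));   // 4
    }
}
```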

Code Quality

  • add class-level GroovyDoc and fix @param mismatches in SpreadsheetImporter
  • add GroovyDoc to SpreadsheetWriter.writeSheets(Map)
  • fix stale/misleading GroovyDoc on FExcelExporter.exportExcel overloads
  • make SpreadsheetImporter.validateNotNull private
  • add @CompileStatic to ValueExtractor
  • replace generic Exception throws in FileUtil with SpreadsheetImportException
  • document why FExcelExporter cannot use @CompileStatic (fastexcel internal GenericStyleSetter access)

Test Coverage

  • 135 tests passing (up from 105 in v2.3.0)

Matrix Tablesaw v0.3.0

05 May 19:39

Breaking Changes

  • Removed previously deprecated OdsReadOptions factory methods
    • OdsReadOptions.builder(Reader) and OdsReadOptions.builderFromString(String) have been removed as promised in the v0.2.2 release notes.
  • BigDecimalColumn arithmetic is now non-mutating by default.
    • plus(), subtract(), multiply(), and divide() return new columns instead of mutating the receiver.
    • Use the new addTo(), subtractBy(), multiplyBy(), and divideBy() methods for in-place mutation.
    • This makes Groovy operator overloading (+, -, *, /) behave intuitively.

New Features

  • Friendlier Gtable factory APIs
    • Gtable.create(Map) infers column types from the first non-null value in each list.
    • Gtable.create(Map, Map<String, ColumnType>) allows named type overrides while inferring the rest.
    • All map-based factories now validate that every list has the same length and throw a clear IllegalArgumentException on mismatch.
  • Table-level normalization convenience
    • Gtable.normalizeMinMax(columnName, outputColumnName?, decimals?)
    • Gtable.normalizeMean(columnName, outputColumnName?, decimals?)
    • Gtable.normalizeStdScale(columnName, outputColumnName?, decimals?)
    • Gtable.normalizeLog(columnName, outputColumnName?, decimals?)
    • Supports DoubleColumn, FloatColumn, and BigDecimalColumn.
    • Non-destructive by default (returns a new Gtable). Omit outputColumnName to replace the source column.
  • Explicit unsupported-column handling in Matrix → Tablesaw conversion
    • TableUtil.toTablesaw(Matrix) now throws IllegalArgumentException for unsupported column types instead of silently skipping them.
    • TableUtil.toTablesaw(Matrix, boolean skipUnsupported) provides an explicit opt-in to skip unsupported columns.

Bug Fixes

  • Preserve Matrix type metadata during Tablesaw conversion
    • TableUtil.classForColumnType now compares against ColumnType constants and BigDecimalColumnType.instance() directly, fixing cases where type metadata was lost.
  • Fix ODS missing cell handling
    • Null cells in ODS spreadsheets are now imported as missing values instead of the literal string "null".
  • Fix XLSX DateTime export
    • LOCAL_DATE_TIME columns now preserve both date and time components when written to XLSX.
  • Gtable.copy() now deep-copies columns
    • Previously copy() reused the original column objects, allowing mutations to leak back to the source table. It now creates independent column copies.
  • XmlReader now throws RuntimeIOException on parse failures
    • DocumentException from dom4j was previously wrapped in a raw RuntimeException; it is now consistently wrapped in RuntimeIOException.
  • BigDecimalAggregateFunctions.cv guards against zero mean
    • Dividing by a zero mean now throws a clear IllegalArgumentException instead of an opaque ArithmeticException.
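
The zero-mean guard described above can be sketched as follows. The coefficient of variation is the standard deviation divided by the mean; the class name and the two-argument signature are assumptions for illustration, not the signature of BigDecimalAggregateFunctions.cv.

```java
import java.math.BigDecimal;
import java.math.MathContext;

final class CvGuardDemo {
    // Hypothetical helper: coefficient of variation with an explicit zero-mean check.
    static BigDecimal cv(BigDecimal stdDev, BigDecimal mean) {
        if (mean.signum() == 0) {
            // BigDecimal.divide would otherwise fail with an ArithmeticException that does not explain the cause.
            throw new IllegalArgumentException("Coefficient of variation is undefined for a zero mean");
        }
        return stdDev.divide(mean, MathContext.DECIMAL64);
    }

    public static void main(String[] args) {
        System.out.println(cv(new BigDecimal("2"), new BigDecimal("4"))); // 0.5
    }
}
```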

Documentation

  • Updated readme.md with current dependency guidance (use matrix-bom or matrix-all), quick examples for conversion, Gtable factories, BigDecimal arithmetic, and normalization.
  • Fixed incorrect BOM version references in readme.md (was 3.7.0, corrected to 2.5.0).
  • Fixed GroovyDoc typos (extansion → extension, tgble → Gtable) and added missing method documentation to the public API surface in Gtable.groovy.
  • Added missing Javadoc to BigDecimalColumn.add(BigDecimalColumn).

Code Quality

  • BigDecimalColumnType.INSTANCE is now final.
  • Extracted assertSameSize(BigDecimalColumn) to eliminate duplicated size-check logic across BigDecimalColumn arithmetic methods.

Build & Publishing

  • Corrected POM url and added a module-local LICENSE file (Apache License 2.0).
  • Updated license metadata in published POM from MIT to Apache 2.0 to align with Tablesaw licensing.

Dependency Updates

  • com.github.miachm.sods:SODS 1.8.2 -> 1.8.3

matrix-gsheets-0.2.0

03 May 15:54

Breaking Changes

  • Renamed authentication classes: BqAuthenticator → GsAuthenticator, BqAuthUtils → GsAuthUtils
    • See docs/0.1-0.2-MIGRATION.md for detailed migration guide with before/after examples

New Features

  • GsheetsWriter.update(String, String, Matrix, ...): Write Matrix data to an existing spreadsheet and range
  • GsheetsWriter.spreadsheetUrl(String): Convenience helper that returns the Google Sheets edit URL for a spreadsheet ID

Improvements

  • Eliminated duplicated service-setup code: Extracted buildSheetsService() helpers in both GsheetsReader and GsheetsWriter
  • Eliminated duplicated utilities in GsheetsWriter: Removed sanitizeSheetName(), toCell(), and MAX_SHEET_NAME_LENGTH — now delegates to GsUtil
  • Fixed sheet-name extraction: When a range has no ! prefix (e.g., A1:D10), the matrix name defaults to '' instead of the raw range string
  • Narrowed exception handling: GsUtil.getSheetNames(String, Sheets) now catches IOException specifically and declares throws IOException
  • Improved numeric precision: GsConverter.asLocalDate(Number) uses longValue() instead of intValue() for day counts
  • Idiomatic Groovy cleanup: Removed ~20 unnecessary return keywords from final expressions; replaced Java-style new ArrayList<>() + .add() with [] literals and << / collect
  • Code quality: Removed commented-out tokeninfo URL from GsAuthUtils; converted inline usage docs to proper GroovyDoc
  • Fixed GsAuthenticator log indentation: log.info call is now correctly inside the if (verbose) block
  • Fixed static field ordering: Moved PROP_USER_HOME above ADC_FILE_PATH to eliminate initialization-order hazard
  • Added missing return type declarations: GsConverter.asLocalTime(Object) → LocalTime, GsConverter.asSerial(Date) → BigDecimal
  • Added historical-naming GroovyDoc notes to GsAuthenticator and GsAuthUtils explaining the Bq prefix

Test Changes

  • Added 5 positive-case tests for GsConverter.toSerials (LocalDates, LocalDateTimes, LocalTimes, mixed types, empty list)
  • Added 5 validation tests for GsheetsWriter.update() (null spreadsheetId, null range, null matrix, empty matrix, no rows)
  • Moved 16 tests (sanitizeSheetName and toCell coverage) from GsExporterTest to GsUtilTest where they belong
  • Removed brittle reflection-based tests from GsheetsReaderTest and GsheetsWriterTest

Dependency Updates

  • com.google.apis:google-api-services-drive v3-rev20260220-2.0.0 → v3-rev20260322-2.0.0
  • com.google.apis:google-api-services-sheets v4-rev20251110-2.0.0 → v4-rev20260213-2.0.0
  • org.mockito:mockito-core 5.22.0 → 5.23.0
  • org.mockito:mockito-junit-jupiter 5.22.0 → 5.23.0
  • se.alipsa.nexus-release-plugin:se.alipsa.nexus-release-plugin.gradle.plugin 2.1.1 → 2.1.2

Matrix Avro 0.3.0

01 May 17:44

  • matrix-avro now compiles Groovy statically by default via config/groovy/compileStatic.groovy; explicit @CompileStatic annotations were removed.
  • CodeNarc is now enforced for the module with ignoreFailures = false, and warnings were fixed.
  • Schema inference was hardened/refactored through ColumnProfile, especially for mutable/object-typed columns, decimal metadata, lists, maps, and record-like maps.
  • fixed pre-epoch local-timestamp-millis reads by using floor modulo for nanosecond remainders
  • made explicit timestamp-millis writes of LocalDateTime timezone-stable by interpreting them at UTC
  • aligned stream ownership with the public docs
    • InputStream read/schema overloads leave caller-owned streams open
    • OutputStream write overloads leave caller-owned streams open
  • removed the writer schema cache so mutated matrices cannot reuse stale schemas
  • added public schema inspection APIs through MatrixAvroReader.schema(...)
  • added convenience factories and shortcuts
    • AvroReadOptions.defaults() and AvroReadOptions.named(...)
    • AvroWriteOptions.defaults() and AvroWriteOptions.exactDecimals()
    • MatrixAvroWriter.writeExactDecimals(...) and writeExactDecimalBytes(...)
    • AvroSchemaDecl.decimalColumn(...), arrayOf(...), and mapOf(...)
  • tightened public validation for schema building and null options
  • refreshed README, tutorial, and cookbook examples for schema inspection and decimal-safe writes
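
The pre-epoch fix in the list above relies on floor division and floor modulo. The sketch below shows why, using only the JDK; the class and method names are hypothetical and the conversion is a simplified stand-in for the module's reader.

```java
import java.time.LocalDateTime;
import java.time.ZoneOffset;

final class PreEpochMillisDemo {
    // Converts epoch milliseconds to a LocalDateTime at UTC, correctly for negative (pre-1970) values.
    static LocalDateTime fromEpochMillis(long millis) {
        // floorDiv/floorMod round toward negative infinity, so the remainder is always in 0..999.
        long seconds = Math.floorDiv(millis, 1000L);
        int nanos = (int) Math.floorMod(millis, 1000L) * 1_000_000;
        return LocalDateTime.ofEpochSecond(seconds, nanos, ZoneOffset.UTC);
    }

    public static void main(String[] args) {
        System.out.println(fromEpochMillis(-1L));   // 1969-12-31T23:59:59.999
        System.out.println(fromEpochMillis(1500L)); // 1970-01-01T00:00:01.500
    }
}
```

With the plain `/` and `%` operators, -1 ms would yield 0 seconds and a remainder of -1, which is not a valid nanosecond-of-second value.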

Matrix SQL v2.4.0

30 Apr 20:16

SQL identifier handling

  • New SqlIdentifier utility class for safe quoting and rendering of table/column names containing spaces, mixed case, reserved words, punctuation, or embedded quotes.
  • All generated SQL (DDL, INSERT, UPDATE, DROP) now routes through SqlIdentifier, replacing ad-hoc string concatenation with proper double-quote escaping.
  • MatrixDbUtil.tableName() sanitises matrix names with a regex-based normaliser instead of targeted character replacements.
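
The double-quote escaping mentioned above follows the standard SQL rule for delimited identifiers: wrap the name in double quotes and double any embedded quote. A minimal sketch, assuming a hypothetical helper that is not the SqlIdentifier API:

```java
final class QuoteIdentifierDemo {
    // Hypothetical helper: renders a table or column name as a SQL delimited identifier.
    static String quote(String identifier) {
        if (identifier == null || identifier.isEmpty()) {
            throw new IllegalArgumentException("Identifier must not be null or empty");
        }
        // An embedded double quote is escaped by doubling it.
        return "\"" + identifier.replace("\"", "\"\"") + "\"";
    }

    public static void main(String[] args) {
        System.out.println(quote("order"));          // "order"
        System.out.println(quote("my \"big\" tbl")); // "my ""big"" tbl"
    }
}
```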

Prepared-statement convenience APIs

  • select(String, List, String) — parameterised SELECT returning a Matrix.
  • update(String, List) — parameterised UPDATE/INSERT/DELETE.
  • delete(String, List) — parameterised DELETE.
  • execute(String, List) — parameterised arbitrary SQL returning a Map<Integer, Object> of result sets and update counts.
  • insert(String, Matrix) — insert a Matrix into an explicitly named table.
  • update(String, Row) — overload that always throws IllegalArgumentException to prevent accidental unconstrained updates; use update(String, Row, String...) with match columns instead.

Managed (non-owning) connections

  • New constructors MatrixSql(Connection, DataBaseProvider) and MatrixSql(Connection, SqlTypeMapper) wrap an externally supplied connection. close() leaves externally supplied connections open and usable.

Factory offline fallback

  • MatrixSqlFactory.FALLBACK_VERSIONS map centralises pinned fallback versions for H2 and Derby.
  • createH2(), createDerby(), and generic create() fall back to the pinned version when Maven Central is unreachable instead of throwing.
  • Error message for unsupported providers now includes the dependency coordinates and a "no fallback version is configured" hint.

MatrixResultSet hardening

  • Guards (ensureOpen, ensureCurrentRow, checkedColumnIndex) prevent operations on closed, unpositioned, or out-of-range result sets.
  • All column accessors routed through readValue() helper for consistency.
  • updateRow() is a documented no-op for detached result sets.
  • Null-safe primitive getters return JDBC-specified defaults (0, false) when the value is null.
  • unwrap() and isWrapperFor() follow the strict JDBC contract.
  • Corrected the Calendar-aware getDate, getTime, and getTimestamp accessors, as well as getURL.
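
The null-safe getter behaviour above mirrors the java.sql.ResultSet contract, where getInt returns 0 and getBoolean returns false for SQL NULL and wasNull() reports whether the last read was NULL. A toy sketch of that contract follows; the class is hypothetical and far simpler than MatrixResultSet.

```java
final class NullSafeGetterDemo {
    private boolean lastWasNull;

    // Returns 0 for a null cell, as JDBC specifies for getInt.
    int getInt(Object cell) {
        lastWasNull = (cell == null);
        return lastWasNull ? 0 : ((Number) cell).intValue();
    }

    // Returns false for a null cell, as JDBC specifies for getBoolean.
    boolean getBoolean(Object cell) {
        lastWasNull = (cell == null);
        return !lastWasNull && ((Boolean) cell).booleanValue();
    }

    // Lets the caller distinguish a real 0/false from a null value.
    boolean wasNull() {
        return lastWasNull;
    }

    public static void main(String[] args) {
        NullSafeGetterDemo rs = new NullSafeGetterDemo();
        System.out.println(rs.getInt(null) + " wasNull=" + rs.wasNull()); // 0 wasNull=true
        System.out.println(rs.getInt(7) + " wasNull=" + rs.wasNull());    // 7 wasNull=false
    }
}
```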

MatrixDbUtil improvements

  • Default minimum sizes for VARCHAR (255), DECIMAL precision (38) and scale (10) when column scanning finds no data.
  • insert(Connection, String, Matrix, boolean) overload passes addQuotes through to generated SQL.

Build and dependency changes

  • Remove dependency-resolver dependency; replaced by maven-utils which now covers the same functionality.
  • Migrate inline dependency declarations to Gradle version catalog (libs.versions.toml).
  • Add groovier-junit test dependency for Groovy-friendly JUnit assertions.
  • Remove log4j-to-slf4j test runtime dependency.
  • Dependency upgrades:
    • se.alipsa.groovy:data-utils [2.0.4 -> 2.0.6]
    • se.alipsa:maven-utils (replaces maven-3.9.11-utils 1.1.0) [-> 1.4.1]
    • commons-io:commons-io [2.21.0 -> 2.22.0]

Documentation and test hygiene

  • Comprehensive GroovyDoc on all public methods and constructors.
  • Expanded README with runnable examples for all public workflows.
  • Replaced all println/System.err.println in tests with assertions or Logger.

Matrix Arff 0.2.1

30 Apr 21:24

  • Fix nominal sentinel values (?, empty string, %-prefixed) being written unquoted, causing lossy ARFF round-trips
  • Add ArffDateFormats utility to share strict (lenient=false), UTC, Locale.ROOT date formatter creation between reader and writer
  • Fix date parsing to reject invalid dates (e.g. 2026-02-31) instead of silently normalizing them
  • Fix published SCM URL in POM (matrix-arff/tree/master → tree/main/matrix-arff)
  • Apply module-wide @CompileStatic via compileStatic.groovy build configuration
  • Add regression tests for nominal sentinel round-trips, invalid date rejection, and UTC date output
  • Fix all CodeNarc warnings and change the build to fail on any new warnings
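
The strict, UTC, Locale.ROOT formatter settings named above can be reproduced with the JDK's SimpleDateFormat. The sketch below is an assumption about what such a factory might look like, not the ArffDateFormats source; the pattern is illustrative.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Locale;
import java.util.TimeZone;

final class StrictDateDemo {
    // Builds a formatter that rejects impossible dates and is independent of the host locale and time zone.
    static SimpleDateFormat strictUtc(String pattern) {
        SimpleDateFormat format = new SimpleDateFormat(pattern, Locale.ROOT);
        format.setLenient(false);
        format.setTimeZone(TimeZone.getTimeZone("UTC"));
        return format;
    }

    static boolean isValid(String pattern, String text) {
        try {
            strictUtc(pattern).parse(text);
            return true;
        } catch (ParseException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(isValid("yyyy-MM-dd", "2026-02-28")); // true
        System.out.println(isValid("yyyy-MM-dd", "2026-02-31")); // false
    }
}
```

A lenient formatter would accept 2026-02-31 and roll it forward into March, which is the silent normalization the fix removes.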

Matrix Parquet v0.5.0

29 Apr 13:03

  • Add SPI integration: MatrixParquetFormatProvider registers .parquet extension with Matrix SPI so Matrix.read(file) and matrix.write(file) work without explicit imports
  • Add ParquetReadOptions and ParquetWriteOptions typed options classes with describe() and fromMap() for runtime discovery and SPI use
  • Add builder API: MatrixParquetReader.builder() and MatrixParquetWriter.builder(matrix) as the recommended fluent interface
  • Add write(OutputStream) and write(Path) overloads to writer; add read(byte[]), read(URL), read(Path), read(InputStream) to builder
  • Add writeBytes() method to write to a byte array without a file
  • Strengthen ParquetWriteOptions.validate(): enforce precision > 0, scale >= 0, scale <= precision for both uniform and per-column decimal meta
  • Add decimalMeta per-column precision/scale validation in validateDecimalMeta (checks shape, range, and consistency)
  • Fix bug: resource leak in MatrixParquetReader — reader now wrapped in withCloseable to ensure streams are always closed
  • Fix bug: negative BigDecimal values were padded with 0x00 instead of 0xFF, corrupting the sign bit on read
  • Fix bug: timestamp type was mapped to MILLIS instead of MICROS, causing precision loss
  • Fix bug: deprecated BigDecimal.ROUND_HALF_UP replaced with RoundingMode.HALF_UP
  • Fix hasUniformPrecisionAndScale() semantics: now returns true only when both precision and scale are non-null
  • Enable @CompileStatic by default for all production sources
  • Enable CodeNarc with ignoreFailures=false; fix all pre-existing violations
  • Remove ivy dependency (was unused)
  • Make internal APIs private; simplify readFromInputStream
  • Migrate tests to @TempDir; remove manual temp-file cleanup boilerplate
  • Correct POM publication URLs: url, license.url, and scm.url now use tree/main paths (Maven Central convention)
  • Add README sections: "At a Glance" goal/entry-point table, "API Surface" reference, "Default Behavior" for naming and decimal defaults
  • Upgrade dependencies
    • org.apache.hadoop:hadoop-common [3.4.2 -> 3.4.3]
    • org.apache.hadoop:hadoop-mapreduce-client-core [3.4.2 -> 3.4.3]
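
The negative-BigDecimal padding fix in the list above comes down to sign extension: a big-endian two's complement value widened to a fixed length must be left-padded with 0xFF when negative and 0x00 otherwise. A minimal sketch with hypothetical names, independent of the Parquet API:

```java
import java.math.BigDecimal;
import java.math.BigInteger;
import java.util.Arrays;

final class DecimalPaddingDemo {
    // Encodes the unscaled value as a fixed-length, big-endian two's complement byte array.
    static byte[] toFixedLength(BigDecimal value, int length) {
        byte[] raw = value.unscaledValue().toByteArray();
        if (raw.length > length) {
            throw new IllegalArgumentException("Value does not fit in " + length + " bytes");
        }
        byte[] out = new byte[length];
        // Sign extension: negative numbers need leading 0xFF bytes, non-negative ones 0x00.
        byte pad = (byte) (value.signum() < 0 ? 0xFF : 0x00);
        Arrays.fill(out, 0, length - raw.length, pad);
        System.arraycopy(raw, 0, out, length - raw.length, raw.length);
        return out;
    }

    // Decodes the bytes back into a BigDecimal with the given scale.
    static BigDecimal fromFixedLength(byte[] bytes, int scale) {
        return new BigDecimal(new BigInteger(bytes), scale);
    }

    public static void main(String[] args) {
        byte[] encoded = toFixedLength(new BigDecimal("-12.34"), 8);
        System.out.println(fromFixedLength(encoded, 2)); // -12.34
    }
}
```

With 0x00 padding, the same bytes for -12.34 would decode as the positive value 643.02, which is the kind of corruption the fix addresses.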

Matrix Json v2.2.0

29 Apr 11:17

  • add fluent WriteBuilder API for JsonWriter: JsonWriter.write(matrix).indent().to(file)
  • add matrix name derivation from file/URL in JsonReader
  • add matrixName option to JsonReadOptions to override file-derived name
  • add type-conversion support (types, dateTimeFormat) to JsonReadOptions / JsonFormatProvider
  • add readString(String) convenience method to JsonReader
  • add missing Path and String write overloads for the formatter variant of JsonWriter
  • switch JsonWriter from Groovy's groovy.json.* to Jackson streaming API (O(columns) peak memory)
  • remove groovy-json build dependency (Jackson used for both reading and writing)
  • enable build-level @CompileStatic for all production code
  • add module-level CodeNarc configuration with ignoreFailures = false
  • fix NPE risk in writeString when columnFormatters is null
  • fix resource leak in JsonReader.read(File, Charset) (nested withCloseable)
  • fix missing directory/parent-dir guards in formatter write(File, ...) overload
  • deprecate all existing static methods on JsonWriter in favor of fluent API
  • change JSON float parsing to deserialize floating-point numbers as BigDecimal instead of Double so imported decimal values keep their exact textual precision
  • upgrade dependencies
    • com.fasterxml.jackson.core:jackson-core 2.21.0 -> 2.21.2
    • com.fasterxml.jackson.core:jackson-databind 2.21.0 -> 2.21.2
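
The float-parsing change above matters because a double cannot hold every decimal literal exactly, while BigDecimal keeps the digits as written. The JDK-only sketch below illustrates the difference; it does not show the JsonReader configuration, and the helper names are hypothetical.

```java
import java.math.BigDecimal;

final class ExactDecimalDemo {
    // Parses the text as BigDecimal and renders it back: digits and scale survive.
    static String viaBigDecimal(String text) {
        return new BigDecimal(text).toPlainString();
    }

    // Parses the text as double and renders it back: trailing zeros and excess digits are lost.
    static String viaDouble(String text) {
        return Double.toString(Double.parseDouble(text));
    }

    public static void main(String[] args) {
        System.out.println(viaBigDecimal("0.10"));                  // 0.10
        System.out.println(viaDouble("0.10"));                      // 0.1
        System.out.println(viaBigDecimal("1.0000000000000000001")); // 1.0000000000000000001
        System.out.println(viaDouble("1.0000000000000000001"));     // 1.0
    }
}
```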

Matrix Stats v2.4.0

23 Apr 19:53

Native runtime cleanup and idiomatic Groovy API expansion

Runtime and Dependency Changes

  • Remove Apache Commons Math from the runtime dependency surface. commons-math3 is now test-only and used to validate native implementations.
  • Add EJML Simple as an internal implementation dependency for the public linear algebra facade.
  • Configure matrix-stats production Groovy compilation through the shared compile-static Gradle script.
  • Make CodeNarc fail the build instead of reporting non-blocking findings.

New Public APIs

Linear Algebra

  • Add se.alipsa.matrix.stats.linalg.Linalg for dense matrix operations:
    inverse, determinant, linear solves, real eigenvalues, and SVD.
  • Add Matrix and Grid adapters with Groovy-facing return types:
    Matrix, Grid<BigDecimal>, BigDecimal, List<BigDecimal>, and SvdResult.
  • Add LinalgSingularMatrixException and shared adapter utilities for singular and invalid matrix input handling.
  • Add compatibility classes under se.alipsa.matrix.stats.linear for native matrix algebra support.

Linear Interpolation

  • Add se.alipsa.matrix.stats.interpolation.Interpolation for public linear interpolation.
  • Support explicit (x, y, targetX) interpolation, evenly spaced series interpolation, and Matrix/Grid-backed column interpolation.
  • Reject extrapolation, unsorted domains, duplicate domain values, length mismatches, ragged grids, and non-numeric columns.
  • Keep spline logic internal to formula/GAM smooth-term support; public spline interpolation is not part of the 2.4.0 API.
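
The explicit (x, y, targetX) form and the rejection rules above can be sketched as follows. This is a minimal illustration on primitive arrays with hypothetical names; the Interpolation class itself works with Groovy-facing types and is not reproduced here.

```java
final class LinearInterpolationDemo {
    // Linearly interpolates y at targetX, rejecting invalid domains and extrapolation.
    static double interpolate(double[] x, double[] y, double targetX) {
        if (x.length != y.length) {
            throw new IllegalArgumentException("x and y must have the same length");
        }
        if (x.length < 2) {
            throw new IllegalArgumentException("At least two points are required");
        }
        for (int i = 1; i < x.length; i++) {
            // Covers both unsorted domains and duplicate domain values.
            if (x[i] <= x[i - 1]) {
                throw new IllegalArgumentException("x must be strictly increasing");
            }
        }
        if (targetX < x[0] || targetX > x[x.length - 1]) {
            throw new IllegalArgumentException("Extrapolation is not supported: " + targetX);
        }
        for (int i = 1; i < x.length; i++) {
            if (targetX <= x[i]) {
                double t = (targetX - x[i - 1]) / (x[i] - x[i - 1]);
                return y[i - 1] + t * (y[i] - y[i - 1]);
            }
        }
        throw new IllegalStateException("Unreachable: targetX lies within the validated domain");
    }

    public static void main(String[] args) {
        System.out.println(interpolate(new double[] {0, 10}, new double[] {0, 100}, 2.5)); // 25.0
    }
}
```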

Formula and Model-Frame Pipeline

  • Add R-style formula parsing and normalization support, including additive terms, intercept control, interactions, shorthand expansion, quoted identifiers, numeric transformations, poly(...), and s(...) smooth terms.
  • Add ModelFrame and ModelFrameResult for design-matrix construction from Matrix data.
  • Add a Groovy-native operator DSL for formulas using |, noIntercept, interaction(...), smooth(...), and I { ... }.
  • Add categorical treatment encoding, dot expansion, subset support, NA handling, weights, offsets, and external environment variable resolution.
  • Add NaAction and formula metadata classes to make downstream model fitting explicit.
  • Reject unsupported response forms, smooth-term interactions, and unsupported frame metadata instead of silently ignoring them.

Fit Registry and Formula-Based Regression

  • Add FitRegistry, FitMethod, FitOptions, and FitResult for named fit-method dispatch.
  • Add built-in lm, loess, and gam fit methods.
  • Add FitDsl convenience entry points so Groovy callers can write lm(data) { y | x + group }, loess(data) { y | x }, and gam(data) { y | smooth(time, 6) + group }.
  • Add MultipleLinearRegression, LmMethod, LoessMethod, and GamMethod.
  • Add LoessOptions and GamOptions with Groovy-facing numeric option surfaces.

Native Distributions

  • Add native NormalDistribution, ChiSquaredDistribution, and HypergeometricDistribution.
  • Extend TDistribution, FDistribution, and SpecialFunctions to remove Commons Math runtime usage.
  • Add Groovy-facing Number overloads returning BigDecimal for public scalar distribution APIs.
  • Add typed entry points for F-distribution ANOVA helpers to avoid unsafe generic overload dispatch.

Native Solvers and Optimization

  • Replace Commons Math optimizer/solver runtime usage with native implementations.
  • Add BrentSolver for one-dimensional bracketing root finding.
  • Update GoalSeek to use the native Brent-Dekker solver.
  • Add NelderMeadOptimizer for derivative-free multivariate minimization.
  • Add LinearProgramSolver for equality-form linear programs with non-negative variables.
  • Add UnivariateObjective and MultivariateObjective interfaces with Groovy-facing numeric bridges.

Idiomatic Groovy Numeric API Cleanup

  • Move public scalar result values toward BigDecimal and public numeric inputs toward Number.
  • Add NumericConversion to centralize finite-value checks, BigDecimal conversion, array/list conversion, alpha validation, and exact integer validation.
  • Add StatUtils and LeastSquaresKernel for small internal double-precision kernels where performance or algorithm constraints justify primitive arrays.
  • Replace duplicate numeric coercion and conversion logic across regression, interpolation, distributions, time-series, KDE, and solver code.
  • Add typed compatibility entry points for Johansen and F-distribution list/array APIs where JVM erasure makes same-name generic overloads unsafe.
  • Deprecate primitive/double-style identity accessors in selected result classes where Groovy-facing properties are now preferred.

Time-Series and Statistical Robustness

  • Extract shared time-series utility logic into TimeSeriesUtils.
  • Improve singular-matrix handling and error messages in time-series code.
  • Harden Johansen critical-value handling for unsupported variable counts.
  • Improve CCM library-size validation by rejecting non-integral numeric inputs instead of truncating silently.
  • Update ANOVA and contingency result APIs to accept Number alpha inputs.

Tests and Quality

  • Add coverage for formula parsing, model-frame construction, design matrices, spline basis expansion, fit registry, lm, loess, gam, multiple linear regression, interpolation, linalg, SVD, native distributions, native solvers, numeric conversion, and least-squares kernels.
  • Add direct unit tests for GroupEstimator.estimateNumberOfGroups and estimateKByElbow, covering both the double[][] and Groovy-facing List overloads, custom maxK/iterations, and error cases (too few points, too few distinct points).
  • Add benchmark-oriented tests for selected Groovy-facing paths versus retained primitive kernels.
  • Update tests for BigDecimal/Groovy-friendly assertions and numeric API behavior.
  • Increase coverage around null handling, weights, offsets, subset filtering, categorical encoding, exact integer validation, singular matrices, and solver convergence.

Documentation

  • Refresh matrix-stats README, tutorial, and cookbook docs for the 2.4.0 API surface.
  • Document the Commons Math runtime removal and EJML implementation detail.
  • Add examples for Linalg, Interpolation, formula/model-frame fitting, the Groovy formula DSL, fit convenience helpers, native distributions, and native solvers.
  • Refresh broader tutorial setup snippets to use the current BOM and Groovy versions.