squarebrackets: Subset Methods as Alternatives to the Square Brackets Operators for Programming
Provides subset methods (supporting both atomic and recursive S3 classes) that may be more convenient alternatives to the `[` and `[<-` operators, whilst maintaining similar performance.
Some nice properties of these methods include, but are not limited to, the following:
- The `[` and `[<-` operators use different rule-sets for different data.frame-like types (data.frames, data.tables, tibbles, tidytables, etc.). The ‘squarebrackets’ methods use the same rule-sets for the different data.frame-like types.
- Performing dimensional subset operations on an array using `[` and `[<-` requires a-priori knowledge of the number of dimensions the array has. The ‘squarebrackets’ methods work on any arbitrary dimensions without requiring such prior knowledge.
- When selecting names with the `[` and `[<-` operators, only the first occurrence of a name is selected in case of duplicate names. The ‘squarebrackets’ methods always operate on all names in case of duplicates, not just the first.
- The `[[` and `[[<-` operators allow operating on a recursive subset of a nested list. But these only operate on a single recursive subset, and are not vectorized for multiple recursive subsets of a nested list at once. ‘squarebrackets’ provides a way to reshape a nested list into a recursive matrix, thereby allowing vectorized operations on recursive subsets of such a nested list.
- The `[<-` operator only supports copy-on-modify semantics for most classes. The ‘squarebrackets’ methods provide explicit pass-by-reference and pass-by-value semantics, whilst still respecting things like binding-locks and mutability rules.
- ‘squarebrackets’ supports index-less sub-set operations, which are more memory efficient for long vectors than sub-set operations using the `[` and `[<-` operators.
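Two of the base R behaviours listed above can be reproduced with a short sketch. It uses base R only and calls no ‘squarebrackets’ functions; `take_first()` is a hypothetical helper written for this illustration, not part of any package.

```r
# Base R only; no 'squarebrackets' functions are called here.

# 1. Duplicate names: `[` with a character index matches only the first occurrence.
x <- c(a = 1, b = 2, a = 3)
first_only <- x["a"]           # a single element: the first "a"
all_a <- x[names(x) == "a"]    # workaround: a logical index selects every "a"

# 2. Arrays: the number of index slots in `[` must match the number of dimensions.
arr3 <- array(1:24, dim = c(2, 3, 4))
slab <- arr3[1, , , drop = FALSE]   # this call shape is valid for 3-d arrays only

# A dimension-agnostic version in base R has to build the call by hand.
# `take_first` is a hypothetical helper for this sketch.
take_first <- function(a) {
  # index 1 on the first dimension, everything (TRUE) on all remaining ones
  idx <- c(list(1L), rep(list(TRUE), length(dim(a)) - 1L))
  do.call(`[`, c(list(a), idx, list(drop = FALSE)))
}

arr4 <- array(1:48, dim = c(2, 3, 4, 2))
dim(take_first(arr3))
dim(take_first(arr4))
```

The hand-built `do.call()` is the kind of boilerplate that is needed in base R when the number of dimensions is not known in advance.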
To get started see ?squarebrackets_help
One can install ‘squarebrackets’ from GitHub like so:
remotes::install_github("https://github.com/tony-aw/squarebrackets")
Special care has been taken to make sure the function names are clear, and that the function names are unlikely to conflict with core R, the recommended R packages, the rstudioapi package, or major packages from the fastverse. So one can attach the package - thus exposing its functions to the namespace - using:
library(squarebrackets)
- 10 March 2024: First GitHub upload - Package is very much experimental.
- 12 March 2024: Changed the introduction help page a bit, added `dt_setadd()`, and added tests for all `dt_` functions. There are slightly over 50,000 tests now.
- 15 March 2024: Added the `sb_setRename()` method, and added tests for this method also.
- 16 March 2024: Fixed a bug in the “rcpp_set_rowcol” source code. Tweaked the documentation here and there a bit. Improved the tests a bit.
- 17 March 2024: Added more tests, and tweaked the documentation a bit.
- 19 March 2024: Some methods/functions did not support mutable_atomic type “complex”; this is now fixed. Added support for mutable_atomic type “raw”. Added tests for atomic type handling. Added the functions `ma_setv()` and `couldb.mutable_atomic()`. Added the options “sb.rat” and “sb.chkdup”; argument `chkdup` is now also set to `FALSE` by default. Added more badges to the documentation.
- 20 March 2024: The user can now also specify `coe = TRUE` in `sb_mod.data.frame()`.
- 24 March 2024: Methods are now split between methods for non-recursive objects (`sb_`), and methods for recursive objects (`sb2_`).
- 26 March 2024: Replaced `seq_rec()` with `seq_rec2()`.
- 27 March 2024: Added `dt_setreorder()`, and added tests for this also. ‘abind’ is now a dependency, and the ‘abind’-based code has been removed, as it is redundant.
- 29 March 2024: Added `sb2_before.array()` and `sb2_after.array()`, and added tests for these also. Added tests for data.frame-like coercion types. Tweaked the documentation here and there a bit.
- 30 March 2024: Removed the separate `NA` checks, as they are redundant. Fixed some linguistic mistakes in the documentation.
- 1 April 2024: Removed `sb_coe()` but kept `sb2_coe()`. Added the `inv` argument to `sb_mod()`/`sb2_mod()`, `sb_set()`/`sb2_set()`, and `sb2_coe()`, and added tests for these. Added `idx1()` for Copy-On-Modification Substitution, and added tests for `idx1()` also. Fixed a bug in character subset ordering in the `sb`/`sb2` `_mod`/`set`/`coe` generic methods. Fixed a bug in the introduction message. Added even more tests. Added `idx1_dim()`, and added tests for these also.
- 5 April 2024: Replaced `idx1`/`idx0` with `idx()`.
- 18 May 2024: Added a few tests for the `idx()` method (need to add more). Fixed the export pattern expressions in the Namespace file. Adjusted the documentation.
- 26 May 2024: Removed `sb2_coe()`, as it is redundant.
- 6 June 2024: Removed the `sb(2)_before/after` methods in favour of the new `bind_`/`bind2_` implementations. Added the `lst_` functions. Added the `options` help page.
- 30 June 2024: Rewrote the internal code for arrays. Added support for backward indexing via Complex Vector indices. Added more tests. Replaced `seq_names()` with the new and far more flexible `idx_r()` function.
- 31 August 2024: Made the tests more efficient. Removed separate method dispatch for factors, as using the default atomic vector method dispatch is sufficient for factors.
- 7 September 2024: Incorporated some ALTREP functionality into the package.
- 15 September 2024: Replaced the `drop` argument with `red` to avoid confusion with base R’s own `drop` mechanic. Small performance improvements for `sub2ind()` and `sb_set.array()`.
- 26 September 2024: Overhauled how indexing with complex vectors works.
- 28 September 2024: Split `sb(2)_setRename()` into `sb_setFlatnames()`, `sb_setDimnames()`, and `sb2_setVarnames()`.
- 10 October 2024: `sb_mod()` now makes partial copies of data.frame-like objects instead of whole copies, for more memory efficiency. Also removed the old `sb_str()` and `sb_a()` functions. Renamed `ci_seq()` to `cp_seq()` (in preparation for the next update).
- 19 October 2024: Removed the renaming methods (`sb_setRename`), and `seq_rec2()`. Added `slcseq_`.
- 5 November 2024: Renamed `slcseq_` to `slice_`. Re-organized the documentation a bit. Fixed examples of the `ci` and `tci` help pages. Added `bind_mat()` and `bind2_mat()`. Added `ndims()`.
- 14 November 2024: Performance improvement of `match_all()`.
- 21 November 2024: Improved the documentation. Slightly tweaked array argument usage. Added `sticky` option. Brought back the renaming methods. Changed behaviour of the `use.names` argument in `lst_untree()`.
- 24 November 2024: Matrices now use the same API as arrays. Adjusted the documentation accordingly. Cleaned up the internal code a bit.
- 30 November 2024: The binding implementations can now bind mixtures of atomic and recursive objects.
- 5 December 2024: Replaced the `_rm` post-fixes with `_wo` in all methods, to avoid confusion. Coercion for data.frame-like objects now happens automatically, and only when needed, in the `sb2_mod()` method; the documentation has been updated accordingly. Slightly re-organized the documentation.