Commit 110430f

Add check to disable armsve on Apple M1.
- (cherry picked from commit c803b03)

Fix auto-detection of firestorm (Apple M1).

- (cherry picked from commit 2dd692b)

Added Discord documentation (#677)

Details:
- Added a docs/Discord.md markdown document that walks the reader through creating a Discord account, obtaining the invite link, and using the link to join the BLIS Discord server.
- Updated README.md to reference the new Discord.md document in multiple places, including via the official Discord logo (used with explicit permission from representatives at Discord Inc.).
- (cherry picked from commit 88105db)

Shuffled checked properties in bli_l3_check.c. (#676)

Details:
- Added certain checks for matrix structure to the level-3 operations' _check() functions, and slightly reorganized existing checks.
- (cherry picked from commit 23f5b8d)

CREDITS file update.

Details:
- This attribution was intended to go in PR #647.
- (cherry picked from commit 9453e0f)

Reinstate sanity check in bli_pool_finalize. (#671)

Details:
- Added a reinit argument to bli_pool_finalize(). This bool will signal whether or not the function is being called from bli_pool_reinit(). If it is not being called from _reinit(), we can safely check to confirm that .top_index == 0 (i.e., all blocks have been checked in). But if it *is* being called from _reinit(), then that check will be skipped since one of the predicted use cases for bli_pool_reinit() anticipates that some blocks are (probably) checked out when the pool_t is reinitialized.
- Updated existing invocations of bli_pool_finalize() to pass in either FALSE (from bli_apool_free_block() or bli_pba_finalize_pools()) or TRUE (from bli_pool_reinit()) for the new reinit argument.
- (cherry picked from commit 76a23bd)

Fix some bugs in bli_pool.c (#670)

Details:
- Add a check for premature pool exhaustion when checking in blocks via bli_pool_checkin_block(). This detects "double-free" and other bad conditions that don't necessarily result in a segfault.
- Make sure to copy all block pointers when growing the pool size. Previously, checked-out block pointers (which are guaranteed to be set to NULL) were not being copied, leading to the presence of uninitialized data.
- (cherry picked from commit 63470b4)

Add AddressSanitizer (-fsanitize=address) option. (#669)

Details:
- Added support for AddressSanitizer (ASan), a compiler-integrated memory error detector. The option (disabled by default) enables compiling and linking with the -fsanitize=address flag supported by clang, gcc, and probably others. This flag is employed during compilation of all BLIS source files *except* for optimized kernels, which are exempted because ASan usually requires an extra register, which violates the constraints for many gemm microkernels.
- Minor whitespace, comment, ordering, and configure help text updates.
- (cherry picked from commit 42d0e66)

Add consistent NaN/Inf handling in sumsqv. (#668)

Details:
- Changed sumsqv implementation as follows:
  - If there is a NaN (either real or imaginary), then return a sum of NaN and unit scale.
  - Else, if there is an Inf (either real or imaginary), then return a sum of +Inf and unit scale.
  - Otherwise behave as normal.
- (cherry picked from commit b861c71)

Parameterized test/3 drivers via command line args. (#667)

Details:
- Rewrote the drivers in test/3, the Makefile, and the runme.sh script so that most of the important parameters, including parameter combo, datatype, storage combo, induced method, problem size range, dimension bindings, number of repeats, and alpha/beta values can be passed in via command line arguments. (Previously, most of these parameters were hard-coded into the driver source, except a few that were hard-coded into the Makefile.) If no argument is given for any particular option, it will be assigned a sane default. Either way, the values employed at runtime will be printed to stdout before the performance data in a section that is commented out with '%' characters (which is used by matlab and octave for comments), unless the -q option is given, in which case the driver will proceed quietly and output only performance data. Each driver also provides extensive help via the -h option, with the help text tailored for the operation in question (e.g. gemm, hemm, herk, etc.). In this help text, the driver reminds the user which implementation it was linked to (e.g. blis, openblas, vendor, eigen). Thanks to Jeff Diamond for suggesting this CLI-based reimagining of the test/3 drivers.
- In the test/3 drivers: converted cpp macro string constants, as well as two string literals (for the opname and pc_str) used in each test driver, to global (or static) const char* strings, and replaced the use of strncpy() for storing the results of the command line argument parsing with pointer copies from the corresponding strings in argv. This works because the argv array is guaranteed by the C99 standard to persist throughout the life of the program. This new approach uses less storage and executes faster. Thanks to Minh Quan Ho for recommending this change.
- Renamed the IMP_STR cpp macro that gets defined on the command line, via the test/3/Makefile, to IMPL_STR.
- Updated runme.sh to set the problem size ranges for single-threaded and multithreaded execution independently from one another, as well as on a per-system basis.
- Added a 'quiet' variable to runme.sh that can easily toggle quiet mode for the test drivers' output.
- Very minor typecast fix in call to bli_getopt() in bli_utils.c.
- In bli_getopt(), changed the nextchar variable from being a local static variable to a field of the getopt_t state struct. (Not sure why it was ever declared static to begin with.)
- Other minor changes to bli_getopt() to accommodate the rewritten test drivers' command line parsing needs.
- (cherry picked from commit ee81efc)

Allow test/3 drivers to use default ind_t method. (#804)

Details:
- Previously, the standalone performance drivers in test/3 were written under the assumption that the user would want to explicitly test either native execution *or* 1m. But because the accompanying runme.sh script defaults to passing "native" in for the -i command line option (which explicitly sets the induced method type), running the script without modification causes the test drivers to use slow reference microkernels on systems where native complex-domain microkernels are not registered -- which will yield poor performance for complex-domain level-3 operations. Furthermore, even if a user was aware of this, the test drivers did not support any single value for the -i option that would test BLIS using the library's default behavior -- that is, using 1m on systems where it is needed and native execution on systems that have native microkernels implemented and registered.
- This commit addresses the aforementioned issue by supporting a new value for the -i option: "auto". The "auto" value causes the driver to avoid explicitly setting the induced method altogether, leaving BLIS's default behavior in place. This "auto" option is also now the default setting within the runme.sh script. Thanks to Leick Robinson for finding and reporting this issue.
- Also added support for "nat" as a shorthand for "native", which the help text already (erroneously) claimed was supported.
- (cherry picked from commit fd1a7e3)

Use "-i auto" by default in test/3 drivers.

Details:
- Request default induced method behavior of BLIS via "-i auto" when running the standalone performance drivers in test/3 via the runme.sh script present in that directory. (Previously, the runme.sh script would use "-i native" by default.) This change was originally intended for fd1a7e3.
- (cherry picked from commit cad5149)
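
The NaN/Inf rule that the message gives for sumsqv can be sketched in C as follows. This is only an illustration under stated assumptions, not the BLIS implementation: `sumsq_special_case` is a hypothetical helper, it handles a real-valued `double` vector only, and it covers just the special-case scan. A complex-domain version would test both the real and imaginary parts of each element, as the message describes.

```c
#include <assert.h>
#include <math.h>
#include <stddef.h>

/* Hypothetical helper (not a BLIS function). Scans x[0..n-1] and applies
 * the rule from the commit message:
 *   - any NaN         -> sumsq = NaN,  scale = 1, return 1
 *   - else any +/-Inf -> sumsq = +Inf, scale = 1, return 1
 *   - otherwise       -> leave scale/sumsq untouched, return 0
 */
static int sumsq_special_case(const double *x, size_t n,
                              double *scale, double *sumsq)
{
    assert(scale != NULL && sumsq != NULL);

    int saw_inf = 0;

    for (size_t i = 0; i < n; ++i) {
        if (isnan(x[i])) {
            /* NaN wins over Inf, even if an Inf was seen earlier. */
            *scale = 1.0;
            *sumsq = NAN;
            return 1;
        }
        if (isinf(x[i])) {
            /* Remember the Inf, but keep scanning for a later NaN. */
            saw_inf = 1;
        }
    }

    if (saw_inf) {
        *scale = 1.0;
        *sumsq = INFINITY;
        return 1;
    }

    return 0; /* No special values: the caller proceeds as normal. */
}
```

In this sketch a caller would run the scan first and fall through to its usual scaled accumulation only when the helper returns 0. Because a NaN takes precedence over an Inf, the scan cannot stop at the first Inf it meets.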
1 parent 3e727c0 commit 110430f

27 files changed: +2510 -932 lines changed

CREDITS

Lines changed: 1 addition & 0 deletions
@@ -37,6 +37,7 @@ but many others have contributed code and feedback, including
 Roman Gareev @gareevroman
 Richard Goldschmidt @SuperFluffy
 Chris Goodyer
+Alexander Grund @Flamefire
 John Gunnels @jagunnels (IBM, T.J. Watson Research Center)
 Ali Emre Gülcü @Lephar
 Jeff Hammond @jeffhammond (Intel)

Makefile

Lines changed: 1 addition & 0 deletions
@@ -1161,6 +1161,7 @@ showconfig: check-env
 	@echo "install includedir: $(INSTALL_INCDIR)"
 	@echo "install sharedir: $(INSTALL_SHAREDIR)"
 	@echo "debugging status: $(DEBUG_TYPE)"
+	@echo "enable AddressSanitizer? $(MK_ENABLE_ASAN)"
 	@echo "enabled threading model(s): $(THREADING_MODEL)"
 	@echo "enable BLAS API? $(MK_ENABLE_BLAS)"
 	@echo "enable CBLAS API? $(MK_ENABLE_CBLAS)"

README.md

Lines changed: 25 additions & 6 deletions
@@ -3,6 +3,8 @@
 [![Build Status](https://api.travis-ci.com/flame/blis.svg?branch=master)](https://app.travis-ci.com/github/flame/blis)
 [![Build Status](https://ci.appveyor.com/api/projects/status/github/flame/blis?branch=master&svg=true)](https://ci.appveyor.com/project/shpc/blis/branch/master)

+[<img alt="Discord logo" title="Join us on Discord!" height="32px" src="docs/images/discord.svg" />](docs/Discord.md)
+
 Contents
 --------

@@ -97,6 +99,17 @@ all of which are available for free via the [edX platform](http://www.edx.org/).
 What's New
 ----------

+* **Join us on Discord!** In 2021, we soft-launched our [Discord](https://discord.com/)
+server by privately inviting current and former collaborators, attendees of our BLIS
+Retreat, as well as other participants within the BLIS ecosystem. We've been thrilled by
+the results thus far, and are happy to announce that our new community is now open to
+the broader public! If you'd like to hang out with other BLIS users and developers,
+ask a question, discuss future features, or just say hello, please feel free to join us!
+We've put together a [step-by-step guide](docs/Discord.md) for creating an account and
+joining our cozy enclave. We even have a monthly "BLIS happy hour" event where people
+can casually come together for a video chat, Q&A, brainstorm session, or whatever it
+happens to unfold into!
+
 * **Addons feature now available!** Have you ever wanted to quickly extend BLIS's
 operation support or define new custom BLIS APIs for your application, but were
 unsure of how to add your source code to BLIS? Do you want to isolate your custom
@@ -417,6 +430,9 @@ If/when you have time, we *strongly* encourage you to read the detailed
 walkthrough of the build system found in our [Build System](docs/BuildSystem.md)
 guide.

+If you are still having trouble, you are welcome to [join us on Discord](docs/Discord.md)
+for further information and/or assistance.
+
 Example Code
 ------------

@@ -500,6 +516,10 @@ empirically measured performance of `gemm` on select hardware architectures
 within BLIS and other BLAS libraries when performing matrix problems where one
 or two dimensions is exceedingly small.

+* **[Discord](docs/Discord.md).** This document describes how to: create an
+account on Discord (if you don't already have one); obtain a private invite
+link; and use that invite link to join our BLIS server on Discord.
+
 * **[Release Notes](docs/ReleaseNotes.md).** This document tracks a summary of
 changes included with each new version of BLIS, along with contributor credits
 for key features.
@@ -610,16 +630,15 @@ has Linux, OSX and Windows binary packages for x86_64.
 Discussion
 ----------

-You can keep in touch with developers and other users of the project by joining
-one of the following mailing lists:
+Most of the active discussions are now happening on our [Discord](https://discord.com/)
+server. Users and developers alike are welcome! Please see the
+[BLIS Discord guide](docs/Discord.md) for a walkthrough of how to join us.
+
+You can also still stay in touch by using either of the following mailing lists:

 * [blis-devel](https://groups.google.com/group/blis-devel): Please join and
 post to this mailing list if you are a BLIS developer, or if you are trying
 to use BLIS beyond simply linking to it as a BLAS library.
-**Note:** Most of the interesting discussions happen here; don't be afraid to
-join! If you would like to submit a bug report, or discuss a possible bug,
-please consider opening a [new issue](https://github.com/flame/blis/issues) on
-github.

 * [blis-discuss](https://groups.google.com/group/blis-discuss): Please join and
 post to this mailing list if you have general questions or feedback regarding

build/config.mk.in

Lines changed: 3 additions & 0 deletions
@@ -124,6 +124,9 @@ LDFLAGS_PRESET := @ldflags_preset@
 # The level of debugging info to generate.
 DEBUG_TYPE := @debug_type@

+# Whether to compile and link the AddressSanitizer library.
+MK_ENABLE_ASAN := @enable_asan@
+
 # Whether operating system support was requested via --enable-system.
 ENABLE_SYSTEM := @enable_system@

common.mk

Lines changed: 27 additions & 7 deletions
@@ -118,6 +118,7 @@ get-noopt-cxxflags-for = $(strip $(CFLAGS_PRESET) \
 get-refinit-cflags-for = $(strip $(call load-var-for,COPTFLAGS,$(1)) \
                           $(call get-noopt-cflags-for,$(1)) \
                           -DBLIS_CNAME=$(1) \
+                          $(BUILD_ASANFLAGS) \
                           $(BUILD_CPPFLAGS) \
                           $(BUILD_SYMFLAGS) \
                           -DBLIS_IN_REF_KERNEL=1 \
@@ -129,6 +130,7 @@ get-refkern-cflags-for = $(strip $(call load-var-for,CROPTFLAGS,$(1)) \
                           $(call get-noopt-cflags-for,$(1)) \
                           $(COMPSIMDFLAGS) \
                           -DBLIS_CNAME=$(1) \
+                          $(BUILD_ASANFLAGS) \
                           $(BUILD_CPPFLAGS) \
                           $(BUILD_SYMFLAGS) \
                           -DBLIS_IN_REF_KERNEL=1 \
@@ -137,12 +139,14 @@ get-refkern-cflags-for = $(strip $(call load-var-for,CROPTFLAGS,$(1)) \

 get-config-cflags-for = $(strip $(call load-var-for,COPTFLAGS,$(1)) \
                          $(call get-noopt-cflags-for,$(1)) \
+                         $(BUILD_ASANFLAGS) \
                          $(BUILD_CPPFLAGS) \
                          $(BUILD_SYMFLAGS) \
                         )

 get-frame-cflags-for = $(strip $(call load-var-for,COPTFLAGS,$(1)) \
                         $(call get-noopt-cflags-for,$(1)) \
+                        $(BUILD_ASANFLAGS) \
                         $(BUILD_CPPFLAGS) \
                         $(BUILD_SYMFLAGS) \
                        )
@@ -201,11 +205,14 @@ get-sandbox-cxxflags-for = $(strip $(call load-var-for,COPTFLAGS,$(1)) \
 # Define a separate function that will return appropriate flags for use by
 # applications that want to use the same basic flags as those used when BLIS
 # was compiled. (NOTE: This is the same as the $(get-frame-cflags-for ...)
-# function, except that it omits two variables that contain flags exclusively
-# for use when BLIS is being compiled/built: BUILD_CPPFLAGS, which contains a
-# cpp macro that confirms that BLIS is being built; and BUILD_SYMFLAGS, which
-# contains symbol export flags that are only needed when a shared library is
-# being compiled/linked.)
+# function, except that it omits a few variables that contain flags exclusively
+# for use when BLIS is being compiled/built:
+# - BUILD_CPPFLAGS, which contains a cpp macro that confirms that BLIS
+#   is being built;
+# - BUILD_SYMFLAGS, which contains symbol export flags that are only
+#   needed when a shared library is being compiled/linked; and
+# - BUILD_ASANFLAGS, which contains a flag that causes the compiler to
+#   insert instrumentation for memory error detection.
 get-user-cflags-for = $(strip $(call load-var-for,COPTFLAGS,$(1)) \
                        $(call get-noopt-cflags-for,$(1)) \
                       )
@@ -563,6 +570,11 @@ ifeq ($(DEBUG_TYPE),sde)
 LDFLAGS := $(filter-out $(LIBMEMKIND),$(LDFLAGS))
 endif

+# If AddressSanitizer is enabled, add the compiler flag to LDFLAGS.
+ifeq ($(MK_ENABLE_ASAN),yes)
+LDFLAGS += -fsanitize=address
+endif
+
 # Specify the shared library's 'soname' field.
 # NOTE: The flag for creating shared objects is different for Linux and OS X.
 ifeq ($(OS_NAME),Darwin)
@@ -808,11 +820,19 @@ $(foreach c, $(CONFIG_LIST_FAM), $(eval $(call append-var-for,CXXLANGFLAGS,$(c))
 CPPROCFLAGS := -D_POSIX_C_SOURCE=200112L
 $(foreach c, $(CONFIG_LIST_FAM), $(eval $(call append-var-for,CPPROCFLAGS,$(c))))

+# --- AddressSanitizer flags ---
+
+ifeq ($(MK_ENABLE_ASAN),yes)
+BUILD_ASANFLAGS := -fsanitize=address
+else
+BUILD_ASANFLAGS :=
+endif
+
 # --- Threading flags ---

 # NOTE: We don't have to explicitly omit -pthread when --disable-system is given
-# since that option forces --enable-threading=none, and thus -pthread never gets
-# added to begin with.
+# since that option forces --enable-threading=single, and thus -pthread never
+# gets added to begin with.

 CTHREADFLAGS :=
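
The `common.mk` hunks above add `$(BUILD_ASANFLAGS)` to the framework, configuration, and reference-kernel flag sets, while the commit message states that optimized kernels are exempt. One way a translation unit could check whether it was actually compiled with instrumentation is a compile-time test like the sketch below. It is not part of BLIS, and `built_with_asan` is a hypothetical name. It relies on two compiler conventions: GCC defines `__SANITIZE_ADDRESS__` when `-fsanitize=address` is in use, and Clang reports the same state through `__has_feature(address_sanitizer)`.

```c
#include <stdbool.h>

/* Compilers without __has_feature (e.g. older GCC) need a fallback so
 * that the #if below still preprocesses cleanly. */
#ifndef __has_feature
#define __has_feature(x) 0
#endif

/* Hypothetical helper (not a BLIS function): reports whether this
 * translation unit was compiled with AddressSanitizer instrumentation. */
static bool built_with_asan(void)
{
#if defined(__SANITIZE_ADDRESS__) || __has_feature(address_sanitizer)
    return true;
#else
    return false;
#endif
}
```

Since the result is fixed per translation unit at compile time, the same check placed in a framework source file and in an optimized kernel file could legitimately give different answers in an `--enable-asan` build.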

configure

Lines changed: 66 additions & 16 deletions
@@ -224,12 +224,22 @@ print_usage()
 	echo " "
 	echo " --enable-mem-tracing, --disable-mem-tracing"
 	echo " "
-	echo " Enable (disable by default) output to stdout that traces"
+	echo " Enable (disabled by default) output to stdout that traces"
 	echo " the allocation and freeing of memory, including the names"
 	echo " of the functions that triggered the allocation/freeing."
 	echo " Enabling this option WILL NEGATIVELY IMPACT PERFORMANCE."
 	echo " Please use only for informational/debugging purposes."
 	echo " "
+	echo " --enable-asan, --disable-asan"
+	echo " "
+	echo " Enable (disabled by default) compiling and linking BLIS"
+	echo " framework code with the AddressSanitizer (ASan) library."
+	echo " Optimized kernels are NOT compiled with ASan support due"
+	echo " to limitations of register assignment in inline assembly."
+	echo " WARNING: ENABLING THIS OPTION WILL NEGATIVELY IMPACT"
+	echo " PERFORMANCE. Please use only for informational/debugging"
+	echo " purposes."
+	echo " "
 	echo " -i SIZE, --int-size=SIZE"
 	echo " "
 	echo " Set the size (in bits) of internal BLIS integers and"
@@ -1325,6 +1335,17 @@ blacklistbu_add()
 	fi
 }

+blacklistos_add()
+{
+	# Check whether we've already blacklisted the given sub-config so
+	# we don't output redundant messages.
+	if [ $(is_in_list "$1" "${config_blist}") == "false" ]; then
+
+		echowarn "The operating system does not support building '$1'; adding to blacklist."
+		config_blist="${config_blist} $1"
+	fi
+}
+
 blacklist_init()
 {
 	config_blist=""
@@ -1979,6 +2000,13 @@ check_assembler()
 	fi
 }

+check_os()
+{
+	if [[ "$(uname -s)" == "Darwin" && "$(uname -m)" == "arm64" ]]; then
+		blacklistos_add "armsve"
+	fi
+}
+
 try_assemble()
 {
 	local cc cflags asm_src asm_base asm_bin rval
@@ -2451,6 +2479,9 @@ main()
 	debug_type=''
 	debug_flag=''

+	# A flag indicating whether AddressSanitizer should be used.
+	enable_asan='no'
+
 	# The system flag.
 	enable_system='yes'

@@ -2576,6 +2607,12 @@ main()
 			disable-debug)
 				debug_flag=0
 				;;
+			enable-asan)
+				enable_asan='yes'
+				;;
+			disable-asan)
+				enable_asan='no'
+				;;
 			enable-verbose-make)
 				enable_verbose='yes'
 				;;
@@ -2867,6 +2904,9 @@ main()
 	get_binutils_version
 	check_assembler

+	# Check if there is any incompatibility due to the operating system.
+	check_os
+
 	# Remove duplicates and whitespace from the blacklist.
 	blacklist_cleanup

@@ -3357,6 +3397,20 @@ main()
 		echo "${script_name}: no preset LDFLAGS detected."
 	fi

+	# Check if the verbose make flag was specified.
+	if [ "x${enable_verbose}" = "xyes" ]; then
+		echo "${script_name}: enabling verbose make output. (disable with 'make V=0'.)"
+	else
+		echo "${script_name}: disabling verbose make output. (enable with 'make V=1'.)"
+	fi
+
+	# Check if the ARG_MAX hack was requested.
+	if [ "x${enable_arg_max_hack}" = "xyes" ]; then
+		echo "${script_name}: enabling ARG_MAX hack."
+	else
+		echo "${script_name}: disabling ARG_MAX hack."
+	fi
+
 	# Check if the debug flag was specified.
 	if [ -n "${debug_flag}" ]; then
 		if [ "x${debug_type}" = "xopt" ]; then
@@ -3373,29 +3427,24 @@ main()
 		echo "${script_name}: debug symbols disabled."
 	fi

-	# Check if the verbose make flag was specified.
-	if [ "x${enable_verbose}" = "xyes" ]; then
-		echo "${script_name}: enabling verbose make output. (disable with 'make V=0'.)"
-	else
-		echo "${script_name}: disabling verbose make output. (enable with 'make V=1'.)"
-	fi
-
-	# Check if the ARG_MAX hack was requested.
-	if [ "x${enable_arg_max_hack}" = "xyes" ]; then
-		echo "${script_name}: enabling ARG_MAX hack."
+	# Check if the AddressSanitizer flag was specified.
+	if [ "x${enable_asan}" = "xyes" ]; then
+		echo "${script_name}: enabling AddressSanitizer support (except for optimized kernels)."
 	else
-		echo "${script_name}: disabling ARG_MAX hack."
+		enable_asan='no'
+		echo "${script_name}: AddressSanitizer support disabled."
 	fi

-	enable_shared_01=1
 	# Check if the static lib flag was specified.
 	if [ "x${enable_static}" = "xyes" -a "x${enable_shared}" = "xyes" ]; then
 		echo "${script_name}: building BLIS as both static and shared libraries."
+		enable_shared_01=1
+	elif [ "x${enable_static}" = "xno" -a "x${enable_shared}" = "xyes" ]; then
+		echo "${script_name}: building BLIS as a shared library (static library disabled)."
+		enable_shared_01=1
 	elif [ "x${enable_static}" = "xyes" -a "x${enable_shared}" = "xno" ]; then
 		echo "${script_name}: building BLIS as a static library (shared library disabled)."
 		enable_shared_01=0
-	elif [ "x${enable_static}" = "xno" -a "x${enable_shared}" = "xyes" ]; then
-		echo "${script_name}: building BLIS as a shared library (static library disabled)."
 	else
 		echo "${script_name}: Both static and shared libraries were disabled."
 		echo "${script_name}: *** Please enable one (or both) to continue."
@@ -3917,7 +3966,7 @@ main()
 	# Create a #define for the configuration family (config_name).
 	uconf=$(echo ${config_name} | tr '[:lower:]' '[:upper:]')
 	config_name_define="#define BLIS_FAMILY_${uconf}\n"
-
+
 	# Create a list of #defines, one for each configuration in config_list.
 	config_list_defines=""
 	for conf in ${config_list}; do
@@ -4012,6 +4061,7 @@ main()
 		| sed -e "s/@libpthread@/${libpthread_esc}/g" \
 		| sed -e "s/@cflags_preset@/${cflags_preset_esc}/g" \
 		| sed -e "s/@ldflags_preset@/${ldflags_preset_esc}/g" \
+		| sed -e "s/@enable_asan@/${enable_asan}/g" \
 		| sed -e "s/@debug_type@/${debug_type}/g" \
 		| sed -e "s/@enable_system@/${enable_system}/g" \
 		| sed -e "s/@threading_model@/${threading_model}/g" \
