1brc is exactly what it's named after: the One Billion Row Challenge. There are 1 billion rows of data given in the following format (city name;temperature):
Hamburg;12.0
Bulawayo;8.9
Palembang;38.8
Hamburg;34.2
St. John's;15.2
Cracow;12.6
... etc. ...
I would like to calculate the <min>/<average>/<max> per city and output them in the following format:
Hamburg;12.0;23.1;34.2
Bulawayo;8.9;22.1;35.2
Palembang;38.8;39.9;41.0
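To make the task concrete, here is a minimal sketch of the core aggregation, in the spirit of the level_0 baseline (an illustration, not the repo's exact code): stream the lines and track min/sum/count/max per city.

```cpp
#include <algorithm>
#include <fstream>
#include <iostream>
#include <limits>
#include <map>
#include <string>

struct Stats {
  float min = std::numeric_limits<float>::max();
  float max = std::numeric_limits<float>::lowest();
  double sum = 0.0;
  long count = 0;
};

int main(int argc, char **argv) {
  if (argc < 2) return 1;
  std::ifstream file(argv[1]);
  std::map<std::string, Stats> stations;
  std::string line;
  while (std::getline(file, line)) {
    auto sep = line.find(';');
    std::string city = line.substr(0, sep);
    float temp = std::stof(line.substr(sep + 1));
    Stats &s = stations[city];
    s.min = std::min(s.min, temp);
    s.max = std::max(s.max, temp);
    s.sum += temp;
    ++s.count;
  }
  // std::map iterates in key order, so output comes out sorted by city name.
  for (const auto &[city, s] : stations)
    std::cout << city << ';' << s.min << ';' << s.sum / s.count << ';'
              << s.max << '\n';
  return 0;
}
```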
More details are at 1️⃣🐝🏎️ 1BRC.
As of 2024, I am running this on a MacBook Pro (2.4 GHz quad-core Intel Core i5) with 16 GB of RAM, and the whole challenge is in C++17 (Clang, not GCC). Performance will vary on a different system.
- Clone the repo
```
git clone https://github.com/xpd54/1brc.git
```
- Update the submodule
```
git submodule update --init --recursive
```
- Create a build folder and build
```
mkdir build && cd build && cmake .. && make
```
The whole project is split into 5 different levels, with the 5th (final) being where I use multithreading.
There is a sample data set measurements.txt which has 100K data points. To generate the full data set I am using the original method mentioned in 1brc. (The 1brc repo is added as a submodule in ./input/1brc; this generates ~13 GB of data, so make sure you have enough disk space.)
level_0 to level_4
Execute with ./level_0 <path to measurements.txt> (measurements.txt from the sample, or the full data set generated from ./input/1brc).
final
The final build takes 3 arguments: ./final <path to measurements.txt> <number of threads> <size of data per thread in MB>. (Optimally, I have run it with 32 threads and 512 MB per thread; a sketch of the section-splitting idea follows.)
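As a rough sketch of how the splitting in final can work (my own illustration under assumptions, not the repo's exact code: mmap'd input, one thread per section, and a hypothetical process() worker), each section boundary is extended to the next newline so no record is cut in half:

```cpp
#include <algorithm>
#include <cstddef>
#include <fcntl.h>
#include <string_view>
#include <sys/mman.h>
#include <sys/stat.h>
#include <thread>
#include <unistd.h>
#include <vector>

// Placeholder worker: the real one would parse the "city;temperature"
// records in its section, accumulate min/sum/count/max into a per-thread
// map, and merge the maps at the end. (The repo's final binary also takes
// the thread count as an argument; that cap is omitted here.)
void process(std::string_view section) { (void)section; }

int main(int argc, char **argv) {
  if (argc < 2) return 1;
  int fd = open(argv[1], O_RDONLY);
  struct stat st;
  fstat(fd, &st);
  const char *data = static_cast<const char *>(
      mmap(nullptr, st.st_size, PROT_READ, MAP_PRIVATE, fd, 0));
  const size_t file_size = static_cast<size_t>(st.st_size);
  const size_t section_size = 512ull << 20; // e.g. 512 MB per section
  std::vector<std::thread> workers;
  size_t begin = 0;
  while (begin < file_size) {
    size_t end = std::min(begin + section_size, file_size);
    // Extend the section to the next '\n' so no record is split in half.
    while (end < file_size && data[end] != '\n') ++end;
    workers.emplace_back(process, std::string_view(data + begin, end - begin));
    begin = end + 1;
  }
  for (auto &t : workers) t.join();
  munmap(const_cast<char *>(data), file_size);
  close(fd);
  return 0;
}
```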
Total run time
Number of stations: 413
Number of threads: 32
Section size: 512 MB
Time taken in milliseconds: 27440 ms
Time taken in seconds: ~27 s
| Levels | Noticeable changes | Notes | Run Time |
|---|---|---|---|
| level_0 | Baseline implementation: uses std::ifstream to read the file and std::map to store results. | Making copies of the data is expensive. | 561356 ms (561 s) |
| level_1 | Avoid copying the map and use std::unordered_map, which is faster than std::map. | std::unordered_map is backed by a hash table, compared to the tree in std::map, which makes it faster. | 319908 ms (319 s) |
| level_2 | Use mmap to map the data file into virtual memory and work on it through std::string_view. Avoid std::stof (a parsing sketch follows the table). | mmap is a system call which maps the file into the process's virtual address space. Details are in Eureka. | 186700 ms (186 s) |
| level_3 | Use a custom map which stores key-value pairs in an array, using std::hash truncated to uint16_t to index into the array (a sketch follows the table). | Accessing the array with pointer arithmetic is faster. | 179987 ms (179 s) |
| level_4 | Use constexpr to build a compile-time array for converting strings to int values. Also use a simple hash function to avoid std::hash completely. | std::hash was slow: I was computing a uint32_t-sized hash only to truncate it to uint16_t. | 157615 ms (157 s) |
| final | Use multithreading: split the data file into smaller sections and pass one to each thread. | After running multiple times, 32 threads with a 512 MB section size gives the best run time. | 27440 ms (~27 s), 32 threads, 512 MB sections |
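To illustrate the level_2/level_4 idea of avoiding std::stof, here is a hedged sketch (assuming, as in the original 1brc data, one decimal digit per temperature; this is the general technique, not the repo's exact code) that parses a value as an integer number of tenths via a constexpr lookup table:

```cpp
#include <array>
#include <cstdint>
#include <string_view>

// Compile-time table mapping ASCII characters to digit values;
// non-digit characters map to 0. Built once at compile time.
constexpr std::array<int8_t, 256> make_digit_table() {
  std::array<int8_t, 256> t{};
  for (int c = '0'; c <= '9'; ++c)
    t[c] = static_cast<int8_t>(c - '0');
  return t;
}
constexpr auto kDigit = make_digit_table();

// Parse e.g. "-12.3" into -123 (tenths of a degree); parse_tenths("34.2") == 342.
// No std::stof and no floating point involved.
int parse_tenths(std::string_view s) {
  bool negative = !s.empty() && s.front() == '-';
  if (negative) s.remove_prefix(1);
  int value = 0;
  for (char c : s)
    if (c != '.') value = value * 10 + kDigit[static_cast<unsigned char>(c)];
  return negative ? -value : value;
}
```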
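And a sketch of the level_3/level_4 style array-backed map (my own illustration of the technique, not the repo's exact layout): a fixed 2^16-slot open-addressing table indexed by a 16-bit hash, with linear probing on collisions.

```cpp
#include <cstdint>
#include <string_view>

struct Entry {
  std::string_view key; // points into the mmap'd file, so no string copies
  int min = 0, max = 0;
  long long sum = 0, count = 0;
  bool used = false;
};

// 2^16 slots, so a uint16_t hash indexes the array directly. With only
// ~413 stations the table never comes close to filling up.
static Entry table[1 << 16];

// Simple FNV-1a style hash truncated to 16 bits; a stand-in for the repo's
// hash (level_4 drops std::hash for something similarly cheap).
uint16_t hash16(std::string_view s) {
  uint32_t h = 2166136261u;
  for (unsigned char c : s)
    h = (h ^ c) * 16777619u;
  return static_cast<uint16_t>(h ^ (h >> 16));
}

Entry &find_slot(std::string_view key) {
  uint16_t i = hash16(key);
  // Linear probing: step forward until we hit the key or an empty slot.
  // The uint16_t index wraps around at 65536 automatically.
  while (table[i].used && table[i].key != key)
    ++i;
  return table[i];
}
```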
On macOS, Instruments provides multiple profiling tools; I am using xctrace.
```
xctrace record --output . --template "Time Profiler" --time-limit 10s --launch -- <your_executable_file> <executable_args>
```
Compiling in debug mode and running for 10 s generates a CPU time profile, where it's really easy to see where the CPU is spending most of its time. This helped me find which parts to focus on and check whether a change actually worked.
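For example, one possible workflow (the build flag and arguments here are illustrative, not prescribed by the repo):

```
cmake -DCMAKE_BUILD_TYPE=Debug .. && make
xctrace record --output . --template "Time Profiler" --time-limit 10s --launch -- ./final measurements.txt 32 512
```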