Show Overlapping

Overview
Test
Implementation

Overview

Determine the number of overlapping threads or API calls based on their start and end times, and output the maximum number.

It is used to discover the amount of parallelism, which can be useful in debugging issues or configuring for optimal processing.

Optionally show the identifier for each process as well.

Written to learn Python.

Input

The input file contains one line per process in the format 'start,end,process id':

2020-12-30T21:56:44.967572,2020-12-30T21:57:44.042572,id0
2020-12-30T21:57:43.006572,2020-12-30T21:58:35.771572,id1
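As a rough illustration of how a line in this format could be read, here is a minimal sketch. The names `Work` and `parse_line` are hypothetical and are not taken from overlap.py; the sketch only assumes that the timestamps are ISO 8601 strings, as in the sample above.

```python
from datetime import datetime
from typing import NamedTuple


class Work(NamedTuple):
    """One unit of work read from an input line (hypothetical type)."""
    start: datetime
    end: datetime
    process_id: str


def parse_line(line: str) -> Work:
    """Split a 'start,end,process id' line and parse both timestamps."""
    start_text, end_text, process_id = line.strip().split(",")
    return Work(
        datetime.fromisoformat(start_text),
        datetime.fromisoformat(end_text),
        process_id,
    )


work = parse_line("2020-12-30T21:56:44.967572,2020-12-30T21:57:44.042572,id0")
print(work.process_id, work.end - work.start)
```

`datetime.fromisoformat` is available from Python 3.7 and keeps the microsecond precision of the sample timestamps.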

Output

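Outputs one line per bucket in the form 'start,end,duration,number of processes,list of each process'. In the example below the duration matches the length of the bucket in milliseconds.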

2020-12-30T21:56:44.967572,2020-12-30T21:57:43.006572,58039,1,['id0']
2020-12-30T21:57:43.006572,2020-12-30T21:57:44.042572,1036,2,['id0', 'id1']
2020-12-30T21:57:44.042572,2020-12-30T21:58:35.771572,51729,1,['id1']

Test

Generate Test Data

test/generate-testdata.py generates test data:

python3 test/generate-testdata.py --count 30 | python3 overlap/overlap.py

API Manager Test

cat test/api-input.csv | python3 api-manager/api_manager.py | python3 overlap/overlap.py

Implementation

Consider 3 threads, each of which runs units of work. On this diagram, Thread 1 processes Work A, then pauses, then processes Work D. Time passes from left to right.

We can break up the time into buckets, where each bucket begins when a unit of work either starts or finishes. On this diagram there are 6 buckets.

Collect all the start and end points of each unit of work into an ordered list of timestamps. Each entry (a node) holds the points that share the same timestamp.

nodes = [
    T1 = [Astart, Bstart], 
    T2 = [Aend],
    T3 = [Cstart],
    ...
    T7 = [Dend]
]
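The collection step could be sketched in Python as follows. This is an illustration only: the function name `collect_nodes`, the `(start, end, name)` tuple layout, and the integer times given to Work A to D are assumptions chosen to be consistent with the T1 to T7 ordering above, not code from overlap.py.

```python
from collections import defaultdict


def collect_nodes(work_items):
    """Group start and end points by timestamp, ordered by time.

    work_items: iterable of (start, end, name) tuples.
    Returns a list of (timestamp, [(kind, name), ...]) pairs.
    """
    points = defaultdict(list)
    for start, end, name in work_items:
        points[start].append(("start", name))
        points[end].append(("end", name))
    # Sort on the timestamp only; the point lists keep insertion order.
    return sorted(points.items(), key=lambda item: item[0])


# Assumed times: A and B start together at 1, D ends last at 7.
work_items = [(1, 2, "A"), (1, 4, "B"), (3, 6, "C"), (5, 7, "D")]
for timestamp, node in collect_nodes(work_items):
    print(timestamp, node)
```

Keying the points by timestamp means that work starting or ending at the same instant lands in the same node, which is what lets T1 hold both Astart and Bstart.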

Then create a bucket for each node

for each node in nodes
    bucket = copy of previous bucket
    for each point in node
        if point is start
            add Work to the bucket
        else // point is end
            remove Work from the bucket
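A minimal Python sketch of the loop above, under a few stated assumptions: `build_buckets` is a hypothetical name, each node is a `(timestamp, [(kind, name), ...])` pair, and the sample times for Work A to D are made up for illustration. It is not the code from overlap.py.

```python
def build_buckets(nodes):
    """Turn ordered nodes into (start, end, active work) buckets.

    nodes: list of (timestamp, [(kind, name), ...]) ordered by timestamp,
    where kind is "start" or "end".
    """
    buckets = []
    active = []  # work in progress, carried over from the previous bucket
    # Pair each node with the next one: a bucket spans the gap between them.
    for (timestamp, points), (next_timestamp, _) in zip(nodes, nodes[1:]):
        for kind, name in points:
            if kind == "start":
                active.append(name)
            else:  # kind == "end"
                active.remove(name)
        buckets.append((timestamp, next_timestamp, list(active)))
    return buckets


# Assumed sample nodes for Work A to D.
nodes = [
    (1, [("start", "A"), ("start", "B")]),
    (2, [("end", "A")]),
    (3, [("start", "C")]),
    (4, [("end", "B")]),
    (5, [("start", "D")]),
    (6, [("end", "C")]),
    (7, [("end", "D")]),
]
for bucket in build_buckets(nodes):
    print(bucket)
```

The last node opens no bucket because nothing follows it, so seven nodes yield six buckets. Copying `active` with `list(active)` keeps each bucket independent of later changes.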

Each bucket contains the list of Work that was in progress during that interval.

buckets = [
    [A, B],   // B1
    [B],      // B2
    [B, C],   // B3
    [C],      // B4
    [C, D],   // B5
    [D]       // B6
]

The bucket with the largest list represents the most parallelism. By storing the start and end of each bucket you can also work out the duration of the buckets.
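To illustrate, the sketch below picks the busiest bucket and computes its duration. The helper names `busiest_bucket` and `duration_ms` and the `(start, end, ids)` tuple layout are assumptions for this example, and the sample values are copied from the Output section.

```python
from datetime import datetime, timedelta


def busiest_bucket(buckets):
    """Return the bucket whose list of work is longest."""
    return max(buckets, key=lambda bucket: len(bucket[2]))


def duration_ms(bucket):
    """Length of a bucket in whole milliseconds."""
    start, end, _ = bucket
    return (end - start) // timedelta(milliseconds=1)


ts = datetime.fromisoformat
buckets = [
    (ts("2020-12-30T21:56:44.967572"), ts("2020-12-30T21:57:43.006572"), ["id0"]),
    (ts("2020-12-30T21:57:43.006572"), ts("2020-12-30T21:57:44.042572"), ["id0", "id1"]),
    (ts("2020-12-30T21:57:44.042572"), ts("2020-12-30T21:58:35.771572"), ["id1"]),
]

busiest = busiest_bucket(buckets)
print(len(busiest[2]), busiest[2], duration_ms(busiest))
```

If several buckets tie for the largest list, `max` returns the first one it encounters.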
