What is Vulkan ? #7

@zakhaev26

We have platforms like LeetCode and Codeforces that rank people based on time of submission, correctness, and algorithmic efficiency.

What we want to build is a similar platform, but for real development tasks.

Users would submit their implementations based on the problem statement, and Vulkan would evaluate all submissions and rank users according to metrics and scoring methods we define.

Here are some core problems I have identified so far (WIP):

  1. Why is this needed ?

Right now, we don’t have a platform that truly judges candidates on software development skills in realistic scenarios. Competitive programming measures algorithmic skill, but it doesn’t measure system design, scalability, resource efficiency, or reliability.

Vulkan would let candidates compete to build faster, more efficient, and more robust implementations of real-world problems. It creates a competitive environment where engineering decisions, at both the LLD and HLD level, actually matter.

  2. How do we evaluate generalized problems ?

This is the most difficult problem in the project. Unlike algorithmic problems, where the I/O is fixed, submissions here could be completely different systems, which makes evaluation difficult. How do we compare two completely different implementations fairly?

One approach is to define API contracts for each problem.

For example, a KV store task would expose APIs that Vulkan can test, and a dependency graph task would expose its own APIs. This gives us a way to measure correctness, performance, and other metrics consistently.
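As a rough illustration, a contract could be a machine-readable list of endpoints and invariants the harness probes. The endpoint names, payload fields, and invariants below are made up for the sketch; the real shape (HTTP routes, payloads, expected status codes) is still open.

```python
# Hypothetical contract for a KV store problem: names and fields are
# illustrative placeholders, not a final spec.
KV_STORE_CONTRACT = {
    "problem_id": "kv-store-v1",
    "endpoints": [
        {"method": "PUT", "path": "/kv/{key}", "body": {"value": "string"}, "expect": 200},
        {"method": "GET", "path": "/kv/{key}", "expect": 200, "returns": {"value": "string"}},
        {"method": "DELETE", "path": "/kv/{key}", "expect": 204},
    ],
    # Properties Vulkan can assert independently of how the system is built.
    "invariants": [
        "GET after PUT returns the last written value",
        "GET after DELETE returns 404",
    ],
}
```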

But even then, candidates would need to implement these APIs themselves.

How do we make this easier? We can provide language-specific templates, so candidates can focus on the system itself, not the boilerplate (like LeetCode does), as sketched below.
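A Python template, for instance, could ship the wiring to the contract and leave only the storage engine to the candidate. The class and method names here are placeholders, not a fixed interface.

```python
from abc import ABC, abstractmethod


class KVStore(ABC):
    """Skeleton handed to candidates; Vulkan's harness would connect it to the
    endpoints defined in the problem contract (names are illustrative)."""

    @abstractmethod
    def put(self, key: str, value: bytes) -> None:
        """Store value under key."""

    @abstractmethod
    def get(self, key: str) -> bytes | None:
        """Return the last value written for key, or None if absent."""

    @abstractmethod
    def delete(self, key: str) -> None:
        """Remove key if present."""


class MySubmission(KVStore):
    """Candidate fills this in; a plain dict is just the trivial starting point."""

    def __init__(self) -> None:
        self._data: dict[str, bytes] = {}

    def put(self, key: str, value: bytes) -> None:
        self._data[key] = value

    def get(self, key: str) -> bytes | None:
        return self._data.get(key)

    def delete(self, key: str) -> None:
        self._data.pop(key, None)
```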

  3. Safe execution of arbitrary code

Candidates can submit ANY code. It could crash the system, consume excessive resources, or interfere with other submissions. This makes safe execution difficult.

One solution is to run each submission in sandboxed environments like Docker or K8s, with strict resource limits and isolation.
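A minimal sketch using the Docker SDK for Python, assuming the submission has already been built into an image; the image tag, limits, and timeout are placeholders.

```python
import docker  # pip install docker

client = docker.from_env()

# Run one submission in an isolated container with hard resource caps.
container = client.containers.run(
    image="vulkan/submission-1234:latest",  # hypothetical image tag
    detach=True,
    mem_limit="512m",            # hard memory cap
    nano_cpus=1_000_000_000,     # 1 CPU
    pids_limit=256,              # guard against fork bombs
    network_mode="none",         # fully isolated here; API problems would use a private bridge instead
    read_only=True,              # immutable root filesystem
    tmpfs={"/tmp": "size=64m"},  # small scratch space only
)

try:
    result = container.wait(timeout=300)  # give up on long-running submissions
finally:
    logs = container.logs().decode(errors="replace")  # keep logs for feedback
    container.remove(force=True)                      # always tear the sandbox down
```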

  4. Fair metric collection

Measuring latency, throughput, memory usage, and reliability sounds simple, but in practice it is not.

CPU contention and network jitter can both skew results. How do we ensure fairness? One option is to run workloads on dedicated nodes, use deterministic inputs wherever possible, and run multiple iterations to get stable metrics. (WIP)
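For example, the harness could replay the same deterministic workload several times and report medians and tail percentiles instead of a single noisy run. The helper below assumes a callable workload and is only a sketch.

```python
import statistics
import time


def measure(workload, iterations: int = 5) -> dict:
    """Run the same deterministic workload several times and report
    stable aggregates instead of one noisy measurement."""
    latencies = []
    for _ in range(iterations):
        start = time.perf_counter()
        workload()  # e.g. replay a fixed op log against the submission
        latencies.append(time.perf_counter() - start)

    latencies.sort()
    return {
        "median_s": statistics.median(latencies),
        "p95_s": latencies[int(0.95 * (len(latencies) - 1))],
        "stdev_s": statistics.stdev(latencies) if len(latencies) > 1 else 0.0,
    }
```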

  5. Workload generation and testing

We need workloads that actually stress the system realistically: millions of GET/SET operations for a KV store, high-throughput reads and writes for a cache service, thousands of concurrent requests for a rate limiter. How do we create workloads that are both fair and challenging?
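One way to keep workloads both challenging and reproducible is to generate them from a fixed seed with a skewed key distribution, so every submission replays the exact same operation stream. The ratios and sizes below are arbitrary placeholders.

```python
import random


def kv_workload(seed: int, n_ops: int = 1_000_000, n_keys: int = 100_000):
    """Yield a deterministic stream of (op, key, value) tuples.

    A skewed key distribution (a few hot keys, a long tail) stresses
    caching behaviour; the 80/20 split and 30% write ratio are placeholders."""
    rng = random.Random(seed)  # same seed -> same workload for every submission
    hot_keys = [f"key-{i}" for i in range(n_keys // 100)]
    cold_keys = [f"key-{i}" for i in range(n_keys // 100, n_keys)]

    for _ in range(n_ops):
        key = rng.choice(hot_keys) if rng.random() < 0.8 else rng.choice(cold_keys)
        if rng.random() < 0.3:  # 30% writes, 70% reads
            yield ("SET", key, rng.randbytes(128))
        else:
            yield ("GET", key, None)
```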

  6. Scoring

We want to rank candidates not just by correctness, but also performance and efficiency. How do we combine these into a single, fair score?
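One simple option is a weighted sum of normalized metrics, with correctness acting as a gate: a submission that fails correctness scores zero regardless of speed. The weights and baselines below are placeholders that would be tuned per problem.

```python
def score(correctness: float, p95_latency_s: float, throughput_ops: float,
          peak_mem_mb: float) -> float:
    """Combine metrics into a single 0-100 score.

    Weights and reference baselines are placeholders; in practice they would
    be defined per problem and published with it."""
    if correctness < 1.0:
        return 0.0  # correctness is a gate, not a weight

    # Normalize each metric to [0, 1] against a per-problem baseline.
    latency_score = min(1.0, 0.050 / max(p95_latency_s, 1e-9))    # 50 ms baseline
    throughput_score = min(1.0, throughput_ops / 50_000)          # 50k ops/s baseline
    memory_score = min(1.0, 256 / max(peak_mem_mb, 1.0))          # 256 MB baseline

    return 100 * (0.5 * latency_score + 0.3 * throughput_score + 0.2 * memory_score)
```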

  7. Feedback

Candidates need meaningful feedback. How do they know why one submission scored better than another?

We can provide logs, performance breakdowns, and correctness reports, so the platform is both competitive and educational.
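For instance, every run could produce a structured report that the UI renders and candidates can compare across submissions; the fields below are illustrative.

```python
from dataclasses import dataclass, field


@dataclass
class SubmissionReport:
    """Per-run report surfaced to the candidate; field names are illustrative."""
    submission_id: str
    correctness_passed: int
    correctness_failed: int
    p95_latency_s: float
    throughput_ops: float
    peak_mem_mb: float
    score: float
    # Raw material for the "why": container logs and per-phase timing breakdown.
    logs: str = ""
    phase_breakdown: dict[str, float] = field(default_factory=dict)
```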

(This thread is a WIP, feedback is welcome!)
