Image Processing on tiny-gpu

This assignment is intended for prospective undergraduate researchers. If you are a prospective graduate researcher, please refer to the AI Hardware Task or the NTT Task. This task introduces you to GPU-style parallel programming using a simplified system. You will implement a basic image processing application, execute it on a minimal GPU platform, and evaluate its behavior and performance.

The goal is to assess your ability to:

  • Understand a new system quickly
  • Map a real-world problem to parallel execution
  • Reason about performance and system limitations

Tiny-GPU Setup

tiny-gpu is a minimal, educational GPU-like system designed to demonstrate the fundamentals of parallel execution. Unlike modern GPUs, it is intentionally simplified and lacks many performance optimizations and features. This makes it easier to understand how threads are scheduled, how memory is accessed, and how parallel workloads are structured. In this assignment, you will use tiny-gpu as a platform to map a simple real-world computation onto a parallel execution model.

Clone and build the tiny-gpu repository:

git clone https://github.com/adam-maj/tiny-gpu.git
cd tiny-gpu

Follow the repository instructions to compile and run the provided examples. As you do this, familiarize yourself with the cocotb Python-based testbench environment and briefly examine how GPU kernels are written and executed. Focus on understanding how threads are launched and how work is divided across them. You are not expected to fully understand the system internals, just enough to modify and run your own kernel.

Task: Image Brightness Adjustment

You will implement a simple image processing operation: increasing the brightness of a grayscale image.

For each pixel:

\[output(x, y) = \min(255, input(x, y) + k)\]

where:

  • \(input(x, y)\) is the original pixel value (0–255)
  • \(k\) is a constant brightness offset added to every pixel (e.g., 30–80)
  • The result must be clamped to 255
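As a host-side reference for checking results, the formula above can be sketched in plain Python. This is a minimal illustration, not part of tiny-gpu; the function name `brighten` and the flat-list image representation are assumptions made for the example.

```python
def brighten(pixels, k):
    """Return a new list with k added to each pixel, saturating at 255.

    `pixels` is a flat list of grayscale values in 0..255 (assumed layout).
    """
    return [min(255, p + k) for p in pixels]


image = [0, 100, 200, 250]
print(brighten(image, 50))  # [50, 150, 250, 255]
```

The last pixel shows the clamp: 250 + 50 would be 300, so the output saturates at 255.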

You may consider the following for your implementation:

  • Represent the image as a 1D or 2D array
  • Assign one thread per pixel
  • Each thread:
    • Reads one pixel
    • Applies the brightness operation
    • Writes the result back
  • Your implementation should:
    • Work for multiple image sizes
    • Produce correct output (proper clamping)
    • Be clearly structured and fully documented in a repository
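The one-thread-per-pixel mapping described above can be modeled in Python before writing the actual kernel. The sketch below is a software model only: it assumes the common GPU convention that a thread's global index is block index × threads per block + thread index, and the value `threads_per_block=4` and the name `run_brightness_kernel` are hypothetical. It also assumes 8-bit arithmetic, where `pixel + k` could wrap around, so the clamp compares against `255 - k` before adding.

```python
def run_brightness_kernel(memory_in, k, threads_per_block=4):
    """Software model of a one-thread-per-pixel launch (not tiny-gpu code)."""
    n = len(memory_in)
    memory_out = [0] * n
    # Round up so a partially filled last block still covers the tail pixels.
    num_blocks = (n + threads_per_block - 1) // threads_per_block

    for block_idx in range(num_blocks):
        for thread_idx in range(threads_per_block):
            # Global pixel index handled by this (block, thread) pair.
            i = block_idx * threads_per_block + thread_idx
            if i >= n:
                continue  # Extra threads in the last block do nothing.
            pixel = memory_in[i]
            # Overflow-safe clamp: decide before adding, so the sum
            # never exceeds 255 even in 8-bit arithmetic.
            memory_out[i] = 255 if pixel > 255 - k else pixel + k
    return memory_out


print(run_brightness_kernel([10, 200, 230, 255, 0], 40))
# [50, 240, 255, 255, 40]
```

The bounds check matters whenever the pixel count is not a multiple of the block size, which is one way an implementation can fail on some image sizes but not others.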

Performance Evaluation

You may use the following two images for your evaluations.

[Figure: Green Parrot and Red Flower. Sample images for the task.]

Run your brightness-adjustment kernel using multiple image sizes, for example:

Image Size | Number of Pixels | tiny-gpu Execution Time
64×64      | 4,096            |
128×128    | 16,384           |
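When planning these runs, it can help to compute the launch shape for each image size up front. The sketch below assumes one thread per pixel and a hypothetical block size of 4 threads; `launch_shape` is an illustrative helper, not a tiny-gpu function.

```python
import math


def launch_shape(width, height, threads_per_block=4):
    """Return (pixel count, blocks needed) for a one-thread-per-pixel launch."""
    pixels = width * height
    blocks = math.ceil(pixels / threads_per_block)
    return pixels, blocks


for side in (64, 128):
    pixels, blocks = launch_shape(side, side)
    print(f"{side}x{side}: {pixels} pixels, {blocks} blocks")
# 64x64: 4096 pixels, 1024 blocks
# 128x128: 16384 pixels, 4096 blocks
```

Doubling the image side quadruples both the pixel count and the number of blocks, which gives a baseline expectation to compare measured execution times against.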

Report the following:

  • Whether the output is correct (you may use Python unit testing for this)
  • How execution time changes with image size
  • Any limitations encountered when scaling
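A correctness check along these lines could be written with Python's standard `unittest` module. In the sketch below, `run_kernel` is a hypothetical stand-in for whatever code collects the output of your simulation; here it is a pure-Python placeholder so the example is self-contained.

```python
import unittest


def reference_brighten(pixels, k):
    """Golden model: add k and saturate at 255."""
    return [min(255, p + k) for p in pixels]


def run_kernel(pixels, k):
    """Hypothetical stand-in: replace with the output read back from simulation."""
    return [255 if p > 255 - k else p + k for p in pixels]


class BrightnessTest(unittest.TestCase):
    def test_matches_reference_for_several_sizes(self):
        for side in (2, 8, 64):
            # Deterministic synthetic image covering the full 0..255 range.
            pixels = [(7 * i) % 256 for i in range(side * side)]
            for k in (30, 80):
                self.assertEqual(run_kernel(pixels, k),
                                 reference_brighten(pixels, k))

    def test_bright_pixels_are_clamped(self):
        self.assertEqual(run_kernel([250, 255], 30), [255, 255])
```

Saved as a file, these tests can be run with `python -m unittest <file>`. Testing near-white pixels explicitly is worthwhile because wrap-around bugs only show up when `input + k` exceeds 255.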

Public Artifact Repository

You must create a public repository containing your solution. The repository must include at least the following components:

  1. Source Code
  2. Testbench
  3. Input Files
  4. README.md

Interview Expectation

During the interview, you will be expected to:

  • Walk through your implementation
  • Explain how pixels are mapped to tiny-gpu threads
  • Explain your design decisions
  • Discuss trade-offs (e.g., image size, latency, memory access, system limitations)
  • Demonstrate your simulation setup

This task is intended as a concise illustration of the core skills expected for this position. It is not designed to have a single “correct” solution. Instead, you are encouraged to explore different design approaches, experiment with trade-offs, and justify the decisions you make.

Focus on demonstrating clarity of thought, sound engineering judgment, and the ability to connect a simple real-world computation to practical hardware execution.

Good luck and happy coding/designing!