Manipulating large data – Lab4 SPO 600

Welcome, in class, we learned about a few different ways you can represent data in a computer. We learned about integers, fixed-point, floating-point, graphics, and sound. This lab focuses on sound data, which is one of the larger forms of data.

How we record sound?

The way we record sound is by sampling it. The typical sample rate is 44.1 or 48 thousand samples per second per channel and each sample size is typically 16 or 24 bit.

What I did for this lab

Task 1

My first task was to benchmark a program that manipulated a random dataset that was simulating a 16 bit sound file. The way we made the data was by generating five million signed integers between positive 32767 and negative 32768.

Here is the code for task 1. – Task1

Task 1 – Results

Result: 94
real 0m0.058s
user 0m0.053s
sys 0m0.005s

Task 1 – Analysis

Using the profiler Perf I can see that the total time running the scaling code is around 18.62% of the program runtime. About 81.38% of the time was generating random data.

Task 2

My second task was to change the formula for scaling the samples to a pre-calculated lookup table. The lookup table would contain all possible sample values multiplied by the volume factor. To get the results, you use the value as the index to look up each sample in that table to get the scaled values.

Here is the code for task 2. – Task 2

Task 2 – Results

Result: 94
real 0m0.646s
user 0m0.633s
sys 0m0.011s

Task 2 – Analysis

Using the profiler Perf I can see that the total time running the scaling code is around 34.27% of the program runtime. About 65.73% of the time was generating random data.

Task 3

My third task was to change the formula for scaling the samples to use fixed point integer math. The reason for doing this is that on most machines they can calculate integer math faster than floating-point. The way I accomplished this was by bit shifting.

The formula I used was: (((246 * 0.75) * SAMPLE) >> 8)

Here is the code for task 3. – Task 3

Task 3 – Results

Result: 873
real 0m0.522s
user 0m0.501s
sys 0m0.020s

Task 3 -Analysis

Using the profiler Perf I can see that the total time running the scaling code is around 18.24% of the program runtime. About 81.76% of the time was generating random data.

Final Analysis

I tested each test multiple times. The results were all similar, with a small variant to speed and the same output every time. The computer doing other tasks is the likely cause for the slight variant in test speeds.

I also tried with a larger sample size than 5 million and the result was the same.

The final results are fixed point integers give us the fastest result, floating-point math came in second, and the lookup table was the slowest.

Now there was a downfall to the fixed point integers. It did not give the same result as the floating-point and lookup table. The result was different since it does not calculate the decimals. The fixed point integers delete the decimal without rounding. By not rounding the numbers, the result was slightly different.