Day 15 — Tutorials, Again #4

September 30th, 2020

Hey, all! Welcome to CryptoCL.

Today, I logged into the lab system to work on the Exercises of the Hands-On OpenCL tutorial. I made some progress into Exercise #09, and have an implementation written, but the program currently results in a segmentation fault. I had only a little time to work on it, since I had other errands to run and the weather was poor, but once I can fix the errors in my implementation, I can work to refine the code and have it pass the exercise.

As far as actual work goes, it was a slow day, but I feel it was a good day.

See you next time!

Kyle Jenkins

Time spent today: 1 hour 15 minutes
Total Time: 21 hours 30 minutes

Categories: Uncategorized Tags: ,

Day 14 — Tutorials, Again #3 and The Game Plan

September 28th, 2020

Hey, all! Welcome to CryptoCL.

Today, I made some progress on my Hands-On OpenCL exercises and met with Dr. Marmorstein about what’s next for the project.

The issue I was having with Exercise #07 was that I had assumed the code I was given was a correct base for what I needed to do. This was a wrong assumption: a cl::NDRange variable called global, which states the dimension range the kernel function runs over, was incorrect*. It was set to be two-dimensional when it was supposed to be one-dimensional. I set it to be one-dimensional of size N, and then declared a cl::NDRange local variable of size N divided by 16, which means it would be 64 (as N is 1024). For Exercise #07, I also needed to store the values of the A matrix in private memory. This gave a 4x improvement to the OpenCL portion of the program compared to Exercise #06.
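To make the sizes above concrete, here is a small sketch in plain C (not OpenCL host code) of how a one-dimensional global range splits into work-groups. The function names are hypothetical, and the sketch assumes a zero global offset and a local size that divides the global size evenly, which is the case for N = 1024 with a local size of N / 16 = 64.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: how many work-groups a 1-D range splits into.
 * Returns 0 when the local size is zero or does not evenly divide
 * the global size. */
size_t work_group_count(size_t global_size, size_t local_size)
{
    if (local_size == 0 || global_size % local_size != 0)
        return 0;
    return global_size / local_size;
}

/* For a work-item with global id gid (zero offset assumed), these mirror
 * what get_group_id(0) and get_local_id(0) would report inside a kernel. */
size_t group_id_of(size_t gid, size_t local_size)
{
    assert(local_size != 0);
    return gid / local_size;
}

size_t local_id_of(size_t gid, size_t local_size)
{
    assert(local_size != 0);
    return gid % local_size;
}
```

With N = 1024 and a local size of 64, work_group_count(1024, 64) gives 16 work-groups, and the work-item with global id 130 lands in group 2 with local id 2.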

Exercise #08, on the other hand, utilized the local memory declaration, and called for me to store the B matrix to local memory. This further increased the time.

*Note: As I am writing this post, I believe this implementation may be wrong, as these matrices are two-dimensional. I will have to look at these exercises again and make sure.

Now reaching Exercise #09, I met with Dr. Marmorstein about what the plan would be for this week. The goal is to continue through the exercises, and then next week we will continue attempting to take what we learned into the BLAKE2 implementation. However, the chances are that we will use a custom cryptographic hash instead. Using OpenCL, the custom hash would be able to excel at confusion, where the ciphertext and key become so scrambled and altered that it is difficult to establish a relationship between them. However, the hash would have a difficult time with diffusion, in which each byte of the plaintext should affect a portion of the ciphertext (and change the ciphertext if changed).
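One rough way to put a number on diffusion is an avalanche check: flip a single input bit and count how many output bits change. The sketch below is only an illustration in plain C. It uses 32-bit FNV-1a, which is not a cryptographic hash, purely as a stand-in for whatever hash ends up being written, and the helper names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* 32-bit FNV-1a: a simple, non-cryptographic hash used only as a stand-in. */
uint32_t fnv1a32(const unsigned char *data, size_t len)
{
    uint32_t h = 2166136261u;           /* FNV offset basis */
    for (size_t i = 0; i < len; i++) {
        h ^= data[i];
        h *= 16777619u;                 /* FNV prime */
    }
    return h;
}

/* Number of bit positions in which two 32-bit digests differ. */
int differing_bits(uint32_t a, uint32_t b)
{
    uint32_t x = a ^ b;
    int count = 0;
    while (x != 0) {
        count += (int)(x & 1u);
        x >>= 1;
    }
    return count;
}

/* Flip one input bit and report how many output bits changed.
 * A hash with good diffusion should average close to half (16 of 32).
 * Returns -1 for inputs this sketch does not support. */
int avalanche_for_bit(const unsigned char *data, size_t len, size_t bit_index)
{
    unsigned char copy[64];
    if (len == 0 || len > sizeof copy || bit_index >= len * 8)
        return -1;
    for (size_t i = 0; i < len; i++)
        copy[i] = data[i];
    copy[bit_index / 8] ^= (unsigned char)(1u << (bit_index % 8));
    return differing_bits(fnv1a32(data, len), fnv1a32(copy, len));
}
```

Averaging avalanche_for_bit over every bit position of many inputs gives a single figure that can be compared between candidate designs.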

Over this week, I will be continuing with my exercises, and will post my progress as I do so. Thank you for reading!

Kyle Jenkins

Time spent today: 1 hour 15 minutes
Total Time: 20 hours 15 minutes


Day 13 — Tutorials, Again #2

September 24th, 2020

Hey, all! Welcome back to CryptoCL.

Today I continued my work on the Hands-On OpenCL tutorial. I didn’t make much headway, since I was busy juggling another project that needed more urgent attention along with these tutorials.

Last time, I was having trouble with Exercise 06. I was able to figure out that issue and solve it, as part of the code that was given was causing the issue. Perhaps it was written in anticipation of systems with multiple devices? Simply commenting out that code fixed the issue, and I replaced it by defining a DEVICE variable that sets the device type to default. After that issue was fixed, the program ran the sequential calculation well, but was then unable to complete the OpenCL computation. I was stuck on this for an hour, until I noticed that the kernel source code did not have an implementation. I wrote an implementation, but found it wasn’t the correct one. The Hands-On OpenCL git repository actually provides an implementation in a presentation slideshow file. Once that was implemented, the code correctly calculated the matrix multiplication!
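For reference, the computation such a kernel performs can be sketched sequentially in plain C. This is not the implementation from the tutorial's slides; it is a minimal illustration that assumes square, row-major N x N matrices, with hypothetical function names. Each call to matmul_element corresponds to the work a single work-item would do for one output element.

```c
#include <stddef.h>

/* What one work-item does in the simplest scheme: compute the single
 * element C[row][col] of C = A * B for square, row-major N x N matrices. */
float matmul_element(const float *A, const float *B, size_t N,
                     size_t row, size_t col)
{
    float sum = 0.0f;
    for (size_t k = 0; k < N; k++)
        sum += A[row * N + k] * B[k * N + col];
    return sum;
}

/* Sequential stand-in for launching an N x N two-dimensional range:
 * every (row, col) pair is independent of the others, so the iterations
 * could run in any order. */
void matmul_naive(const float *A, const float *B, float *C, size_t N)
{
    for (size_t row = 0; row < N; row++)
        for (size_t col = 0; col < N; col++)
            C[row * N + col] = matmul_element(A, B, N, row, col);
}
```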

I moved on to Exercise 07, where the goal is to calculate the matrix multiplication one row at a time, instead of one element at row x and column y. I edited the kernel source, but was unable to get the exercise working.
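The row-at-a-time idea can be sketched in plain C as follows. This is an illustration under assumptions (square, row-major N x N matrices, hypothetical function names), not the exercise's kernel: each call to matmul_row plays the part of one work-item in a one-dimensional range of N work-items.

```c
#include <stddef.h>

/* One "work-item" per row: compute every element of row `row` of
 * C = A * B for square, row-major N x N matrices. */
void matmul_row(const float *A, const float *B, float *C,
                size_t N, size_t row)
{
    for (size_t col = 0; col < N; col++) {
        float sum = 0.0f;
        for (size_t k = 0; k < N; k++)
            sum += A[row * N + k] * B[k * N + col];
        C[row * N + col] = sum;
    }
}

/* Sequential stand-in for a one-dimensional range of N work-items. */
void matmul_by_rows(const float *A, const float *B, float *C, size_t N)
{
    for (size_t row = 0; row < N; row++)
        matmul_row(A, B, C, N, row);
}
```

Because each row is handled by a single work-item, the index range only needs one dimension of size N rather than an N x N grid.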

Next time, I will continue working on my exercises. Dr. Marmorstein has been sick for the week, so we might or might not meet tomorrow. Until next time!

Kyle Jenkins.

Time spent today: 3 hours
Total Time: 19 hours


Day 12 — Tutorials, Again #1

September 20th, 2020

Hey, all! Welcome to CryptoCL.

Today, I did a few of the Exercises in the Hands-On OpenCL tutorial mentioned yesterday. I’ve completed every exercise up to Exercise #6, which is where I stopped.

The first two exercises were mostly to check whether your system is able to run OpenCL. Exercise #3 had you analyze a given program and understand what is happening within it: it adds two vectors together, then reports how many results passed along with any equations that were wrong.

The fourth one is where things would get tricky. It was the same program, but they wanted you to run the addition 3 times, adding vectors D, E, F, and G. First you would find C = A + B, then D = C + E, and finally return F = D + G. I ran the same function three times, except I replaced the inputs with the inputs I needed, and changed C and D to Read and Write. Eventually, I got the program to work after also changing the places where I was supposed to return C values so that they return F values.
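The chain of additions can be sketched in plain C like this. It is only a sequential illustration with hypothetical function names, where each call to vadd stands in for one launch of the vector-add kernel.

```c
#include <stddef.h>

/* Element-wise sum, standing in for one launch of the vector-add kernel. */
void vadd(const float *x, const float *y, float *out, size_t n)
{
    for (size_t i = 0; i < n; i++)
        out[i] = x[i] + y[i];
}

/* Three launches in sequence. c and d are written by one launch and read
 * by the next, which is why their buffers need to be read-write. */
void vadd_chain(const float *a, const float *b, const float *e,
                const float *g, float *c, float *d, float *f, size_t n)
{
    vadd(a, b, c, n);   /* C = A + B */
    vadd(c, e, d, n);   /* D = C + E */
    vadd(d, g, f, n);   /* F = D + G */
}
```

Only f needs to be copied back and checked at the end, since c and d are intermediate results.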

The fifth one, I feel, was easier than the fourth exercise. In the fifth exercise, you had to change the program and kernel to add an additional vector, which I named D. Nothing was entirely complicated about it — the only thing I needed to do was add the D vector to wherever I needed it.

I stopped at the sixth exercise, which will require me to create an OpenCL program from scratch to multiply matrices. However, using the previous programs as a base, I think this exercise will be easy.

Next time, I’ll talk more about the upcoming Exercises in the tutorial. Stay tuned!

Kyle Jenkins.

Time spent today: 1 hour 15 minutes
Total Time: 16 hours


Day 11 — Back to Tutorials

September 19th, 2020

Hey, all. Welcome to CryptoCL!

I met with Dr. Marmorstein yesterday. We discussed what we had developed toward an OpenCL implementation of BLAKE2. Dr. Marmorstein was quite a bit further along than I was with his BLAKE2b implementation, but had also run into an issue: the size of the data he was trying to give it was still too small. It was larger than mine, which was at 100000, but still too small for the operations we were trying to accomplish.

We made the decision to go back to doing tutorials. Earlier in the project development, Dr. Marmorstein found a tutorial called “Hands On OpenCL”, written by Simon McIntosh-Smith and Tom Deakin. The goal of the tutorial is to provide exercises to educate on how OpenCL works. After taking a look at the files, it does seem like a very useful and effective way to learn OpenCL. We decided to spend the week working on these exercises.

This week will definitely be much more tame than the last week or so of development time, but will be essential in exercising and testing our understanding of the OpenCL standard. Expect the next few posts to be about the tutorial.

Thank you, and see you next time!

Kyle Jenkins.

Time spent today: 1 hour
Total Time: 14 hours 45 minutes


Day 10 — 10 Days of CryptoCL, and Ongoing Implementation

September 17th, 2020

Hello, all! Welcome to CryptoCL.

Firstly, thank you for joining me on Day 10 of CryptoCL. Every day I work on CryptoCL, I log my progress and thoughts, and I’m glad you all are joining me on this ongoing project. Thank you!

Back on topic, implementation of OpenCL into BLAKE2s continues. I managed to find a slight workaround for the issue I stopped on, which is to simply set the second argument to 0. While in the main function the argument is supposed to be 0-7 for all 8 instances of G being called, I’m only going to work on one instance until things work. I just have the result of the kernel operation stored into “part” variables, p1-p4, for v[0], v[4], v[8], and v[12], respectively. Then, I free up the memory these cl objects take up within ROUND and main once I’m done.

One issue I ran into today was needing to make new definitions of the blake2s functions. The function blake2s_compress is easy to fix, as it is declared and defined within the new blake2s-ref-driver.c file. However, functions like blake2s_init_key, blake2s_update, blake2s_final, and blake2s are declared in the blake2.h file. I could redefine them, but that would break other files within the directory that rely on those functions. I chose to copy those functions and make new definitions, which differ from the originals by taking the OpenCL objects (like cl_context and cl_kernel) and by having the “_driver” suffix.

I was running into an issue on compiling, where the compiler couldn’t recognize OpenCL functions, but then I remembered I had run into this issue already, and included the OpenCL library in my makefile.

The compiler is still having an issue at the moment — there’s an undefined reference to main. I don’t use makefiles often, and this seems to be an issue with being unable to make an output file. I will take a look tomorrow.

Next time, it will be time to finish the implementation and start bug hunting.

Thank you again for 10 days! See you next time!

Kyle Jenkins.

Time spent today: 1 hour 45 minutes
Total Time: 13 hours 45 minutes


Day 9 — The Plan, and OpenCL Implementation into BLAKE2s

September 15th, 2020

Hey all, welcome to CryptoCL.

After talking with Dr. Marmorstein, we decided that we will each take one version of BLAKE2 to implement using OpenCL. I am tasked with implementing the BLAKE2s version.

However, I was a bit confused after our last meeting: I mistook my task to be implementing the ROUND function as an OpenCL kernel. This was a mistake, as I needed to implement the G function as the OpenCL kernel. After a quick fix, I began adding various OpenCL function calls to the BLAKE2s implementation.

I had a choice to make in whether I should create the OpenCL objects in the main function, or later on in the ROUND function. Doing the former means that the program will be quick to end if there is an issue with building the kernel; however, these variables will have to be carried from main all the way to wherever the ROUND function is used. Doing the latter meant that, while everything was concisely packed into the ROUND function, the program would have travelled pretty far already, and it would be difficult to manage the memory safely. Weighing the options, I opted for the former, and decided to create the OpenCL objects within the main function.

Memory objects that were to be used by the kernel, however, will be created within the ROUND function, and disposed of at the end of the ROUND function.

I decided to stop in the middle of my work after looking over it for a long while, since I have other obligations, at the point where the kernel arguments were being assigned to the function. I stopped at a good place, too: the G function takes in an unsigned int i, which takes the values 0-7 within the ROUND function. I might need to understand more of how OpenCL works, because I am not quite sure how to implement this as an argument for the kernel. I’ll do some more research to find out, and continue next time.

Next time, I should be able to finish implementation of the OpenCL version of BLAKE2s, and then it will be time to fix errors or bugs. (In an ideal world, the program would have no errors or bugs and I could move on to testing, but this is not the likely outcome!)

See you next time!

Kyle Jenkins.

Time spent today: 3 hours
Total Time: 12 hours


Day 8 — BLAKE2 Implementation #2, A New Direction?

September 12th, 2020

Hello, all! Welcome to CryptoCL.

Today, I completed implementing BLAKE2b and BLAKE2s. It appears that, although I cannot accurately tell what is happening when running the programs, they do successfully work.

I met with Dr. Marmorstein yesterday to discuss our next move. We explored the code, and ran into an issue: a core function of BLAKE2, known as function G, runs sequentially. For a normal program, this is not an issue. However, for OpenCL, this is grave news. As discussed earlier, OpenCL is used to allow for parallelism. However, since the hash function for BLAKE2 requires values to be updated and then used elsewhere within the same function, it is impossible to run the function in parallel, as information would not be properly updated or might even be overwritten.

We decided to try and brainstorm a new plan of action. In the meantime, I worked to implement the ROUND function as OpenCL kernels. ROUND calls the function G eight times, and the four calls within the first half are independent of each other, as are the four calls within the second half. Ergo, we believe that if we run this function as two kernels, one for each half of the ROUND function, we can at least achieve some form of parallelism.
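The structure of a round can be sketched in plain C as below. The G function and the index pattern follow the BLAKE2s definition in RFC 7693; everything else is an assumption made for illustration, including the helper names, the `order` parameter, and the use of the identity message schedule (as in the first round). This is not the kernel code described in this post.

```c
#include <stdint.h>

/* Rotate a 32-bit word right by n bits (0 < n < 32). */
static uint32_t rotr32(uint32_t w, unsigned n)
{
    return (w >> n) | (w << (32 - n));
}

/* BLAKE2s mixing function G from RFC 7693 (rotations 16, 12, 8, 7).
 * It updates only the four words v[a], v[b], v[c], v[d]. */
void blake2s_g(uint32_t v[16], int a, int b, int c, int d,
               uint32_t x, uint32_t y)
{
    v[a] = v[a] + v[b] + x;
    v[d] = rotr32(v[d] ^ v[a], 16);
    v[c] = v[c] + v[d];
    v[b] = rotr32(v[b] ^ v[c], 12);
    v[a] = v[a] + v[b] + y;
    v[d] = rotr32(v[d] ^ v[a], 8);
    v[c] = v[c] + v[d];
    v[b] = rotr32(v[b] ^ v[c], 7);
}

/* Column half of a round. Call i touches only v[i], v[i+4], v[i+8] and
 * v[i+12], so the four calls share no state. `order` is a hypothetical
 * parameter that lets a caller run them in any order to see this. */
void column_step(uint32_t v[16], const uint32_t m[16], const int order[4])
{
    for (int n = 0; n < 4; n++) {
        int i = order[n];
        blake2s_g(v, i, i + 4, i + 8, i + 12, m[2 * i], m[2 * i + 1]);
    }
}

/* Diagonal half: again four mutually independent calls, but each one reads
 * words the column half has just written, so it has to run afterwards. */
void diagonal_step(uint32_t v[16], const uint32_t m[16], const int order[4])
{
    for (int n = 0; n < 4; n++) {
        int i = order[n];
        blake2s_g(v, i, 4 + (i + 1) % 4, 8 + (i + 2) % 4, 12 + (i + 3) % 4,
                  m[8 + 2 * i], m[9 + 2 * i]);
    }
}
```

Running either half with the order {0, 1, 2, 3} or {3, 2, 1, 0} leaves the state identical, which is the property that would let the four calls of one half run as parallel work-items.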

For both BLAKE2b and BLAKE2s, I implemented two files for each called “blake2?_round(x)_kernel.cl”, where ? indicates either b or s, and x signals 1 or 2 for the top or bottom half of the ROUND function, respectively. These functions are basically untouched from how they are presented in the original function — just now the function has a __kernel prefix and all variables have a __global prefix.

Next time, I will meet with Dr. Marmorstein to discuss where to go next with this project. Chances are, we will design our own primitive cryptographic hashing algorithm that will be able to run in parallel. One idea is to run operations on pieces of the input in parallel, combining the results together until we get the final hash.
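The "pieces in parallel, then combine" idea might look something like the following in plain C. This is only a toy sketch under heavy assumptions: the piece size, the function names, and the use of 32-bit FNV-1a (which is not cryptographically secure) are all placeholders, and nothing here is the design the project will actually use.

```c
#include <stddef.h>
#include <stdint.h>

#define CHUNK_SIZE ((size_t)16)   /* hypothetical piece size in bytes */
#define MAX_CHUNKS 64             /* most pieces this sketch will handle */

/* Toy 32-bit FNV-1a, NOT a cryptographic hash; it stands in for the
 * real per-piece work. */
static uint32_t toy_hash(const unsigned char *data, size_t len, uint32_t seed)
{
    uint32_t h = 2166136261u ^ seed;
    for (size_t i = 0; i < len; i++) {
        h ^= data[i];
        h *= 16777619u;
    }
    return h;
}

/* Hash piece number `index` on its own. Mixing the index into the seed
 * keeps identical pieces at different positions from hashing alike. */
uint32_t hash_piece(const unsigned char *data, size_t len, size_t index)
{
    size_t start = index * CHUNK_SIZE;
    size_t n = (len - start < CHUNK_SIZE) ? len - start : CHUNK_SIZE;
    return toy_hash(data + start, n, (uint32_t)index);
}

/* The loop that fills `parts` has no dependency between iterations, so
 * each iteration could be a separate work-item. The combine step then
 * folds the results in a fixed order. Returns 0 if the input is too long
 * for this sketch. */
uint32_t chunked_hash(const unsigned char *data, size_t len)
{
    uint32_t parts[MAX_CHUNKS];
    size_t count = (len + CHUNK_SIZE - 1) / CHUNK_SIZE;
    if (count > MAX_CHUNKS)
        return 0;
    for (size_t i = 0; i < count; i++)
        parts[i] = hash_piece(data, len, i);
    uint32_t h = 2166136261u;
    for (size_t i = 0; i < count; i++) {
        h ^= parts[i];
        h *= 16777619u;
    }
    return h;
}
```

The trade-off in a layout like this is that the per-piece work parallelizes well, while the combine step stays sequential unless it is arranged as a tree.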

Until then, have a good night!

Kyle Jenkins.

Time spent today: 1 hour 45 minutes
Total Time: 9 hours


Day 7 — BLAKE2 Implementation

September 10th, 2020

Hi, all! Welcome to CryptoCL.

Firstly, an update on the issue with the Rob Farber tutorial: After testing it on the lab system, the program successfully ran and passed all the tests. The lab systems have support for OpenCL, while my system at home does not. Ergo, using the lab systems to handle programs that deal with OpenCL will be crucial.

Next, BLAKE2 implementation has begun! Firstly, we are starting from the BLAKE2b and BLAKE2s (abbreviated as B2b and B2s hereafter) implementations provided in the BLAKE2 git repository. After browsing the code, I compiled both versions of B2b and B2s (one version specializes in speed, while the other favors portability and simplicity, as stated in the README). They all compiled and ran successfully, at least from what I can tell: looking through the code shows that the program prints “ok” when it runs without errors.

The problem is that the code doesn’t give any other information. When testing cryptographic functions, I would usually provide some sort of input and compare the output to what the answer is supposed to be. However, the implementation provided does not give any other information besides whether the program ran successfully.

Tomorrow, I will continue looking through the code, both to study the implementation for the OpenCL-compatible version and to see how I can accurately test the code.

Have a good day!

Kyle Jenkins.

Time spent today: 1 hour
Total Time: 7 hours 15 minutes

Day 6 — Tutorial #2, Decisions, and Name Change

September 9th, 2020

Hi, all. Welcome to CryptoCL.

Firstly, the name change. I think the name change was a matter of time. I’m a bit disappointed it’s not a complete acronym, but I think there will be… fewer problems with this new one, so that should definitely outweigh the cons.

Now for the actual content of the blog post: today I began implementing the Rob Farber tutorial. An interesting difference from the Erik Smistad tutorial from previous posts, besides the two being in C++ and C respectively, is that Farber chose to implement his kernel source code as a constant character array, rather than as its own file. I think I prefer Smistad’s method, however, as that keeps the kernel files separate from the main files and easier to find.

The same issue of the clCreateCommandQueue function call being deprecated in Smistad’s tutorial was also present in Farber’s tutorial, and was fixed the same way as last time, by adding “WithProperties” to the end of the function name. I also ran into a simple bug where I forgot to include the stdc++ library in my compile command.

The program now compiles and runs. However, the results are not what I expected them to be. This may just be an issue with the remote access, and needs to be tested in person. I double-checked by running Smistad’s test remotely, which also didn’t work correctly.

By the next blog post, I will run the program to make sure things are running smoothly. I will also begin with the actual implementation of BLAKE. We are deciding which version of BLAKE2 to implement: either BLAKE2b or BLAKE2s. 2b is optimized for 64-bit platforms, while 2s is optimized for 8- to 32-bit platforms, as explained in the BLAKE2 RFC under section 1, Introduction and Terminology. We may even implement both!

See you next time, and thank you for reading!

Kyle Jenkins.

Time spent today: 1 hour 15 minutes
Total Time: 6 hours 15 minutes