Vjeux

Dec

My CS Degree at EPITA Was Worth It

vjeux EPITA Add you comment

It's very trendy to bash at Computer Science degrees saying that it costs a lot of time and money and at the end, you haven't learned much useful things for your day job. My experience going to EPITA, a French school in Paris, has been the complete opposite!

I started programming when I was around 10. By 13, I was already being contracted for real money by someone in the US (Hi Thott!). So, when I started EPITA at 18, I already had a ton of experience as a self-taught programmer. My biggest fear was: "Would I learn something new?" and, I did learn a ton! I know wouldn't have done all the impactful things I have at Facebook without it.

What struck me was that I already knew a lot and I knew there was a lot left to learn, but I didn't really know what nor how to get started learning it on my own.

Big Themes

Here are some themes where I learned entirely new domains of knowledge at school, many of which I would likely not have learned, or at least not as in depth if I didn't do a CS degree.

C: I've written a ton of C at school and it was super annoying but learning how to do manual memory management, use pointers and have raw access to memory was a huge breakthrough in how I understood how code was actually running.
Data Structures and Algorithms. I had 6 hours of courses per day for 2 years. There's so much to learn! What's interesting is that you don't need any of it to build what you want, but once you hit scale and you want to optimize and maintain it, knowing about all of them become critical!
OCaml: The whole idea of manipulating immutable lists via recursion to actually do something useful at the end was very weird. Even though I still despise this style of programming, there are good ideas behind it and React is a good example of a practical system that makes use of them.
Source Control: I used to change index2.php on the production machine and cp it to index.php using sftp when I was happy with my changes... for a website with 1 million unique visitors a day... It was fun and scary, but I wouldn't go back to that in a million years!
Assembly: Learning how a computer works from nand gates all the way to assembly was really mind blowing. There are so many levels of abstractions happening between the hardware and some piece of JavaScript that executes in the browser that it's hard to believe it works and isn't crazy slow.
Machine Learning (it wasn't hype at the time) is really not that magical. It's a bunch of heuristics in order to massage the data in a way that it can be separated by a line. It also made me happy that finally all the linear algebra I've learned for years was useful to something. (Math was also useful for all the signal processing stuff with fourrier transform!)
Academia. My theory is that nothing is new, everything has already been explored but because people there insist on using latex with maths symbols everywhere and publish to closed platforms, it doesn't get the reach it deserves.
Tradeoffs. If I were to summarize all the learnings, it's probably that there is not one perfect solution. There's usually a bunch of solutions with different sets of tradeoffs and you need to find the least shitty one. It was a sad realization but gave me a much better framework to work with.

Specific School Projects

What was awesome about EPITA is that not only we had classes like any normal school but also a ton of actual non-trivial projects to implement. Here are some that had a profound impact on me:

Reimplementing Warcraft 3 from scratch. I learned so much about parsing (and reverse engineering) data structures, the whole 3D rendering space... But also having a project with a team of 4 people for an entire year and how to sell. It was a pretty awesome first year project 🙂
Reimplementing malloc from scratch (btw, printf uses malloc which makes debugging interesting...). Learned so much about memory management and running firefox using my malloc was so awesome!
Reimplementing bash from scratch. I learned a ton about how to parse and execute programming languages and how crazy the unix low level apis are. I'm still using this knowledge today when writing scripts thanks to this project.
Implementing a regex execution using OCaml. OCaml is --really-- good for working with well defined data structures and algorithms (I have big reservations for anything else outside of this narrow scope...). Since I'm using regexes almost on a daily basis, it's been super helpful.
We had 4-hour machine exams every couple of weeks for a year where you basically have to solve as many interview-style problems as possible. This was so much fun and helped to land a job at Facebook.
Implement the fastest possible fuzzy finder. I learned so much about orders of magnitude in perf: just opening the file in JavaScript took way too long, C is faster... malloc is really slow, custom bump allocator is better. Smaller structs is better, bit packing ftw. Pre-computing data for heuristics can dramatically help.

And many more... but that's a good enough list for now!

Actual Use Cases

Now, the big question is: but did you use any of that at work? And the answer is yes, so many times!

Small thing but when building our open source websites, I wanted to only write next: page in the markdown and not have to have to write prev nor have a global index and yet have an table of content. It took me a second to realize that I had a list of edges of a graph and I needed to rebuild a graph from it. Because I've written so many graph traversals at school, it took me under an hour to build it correctly.
EPITA enforced strict lint rules where for each violation you'd lose 2 points (out of 20) for all your assignments. The trick is that you were only given a pdf with the rules and no program to verify them, so you had to learn them the hard way. This motivated me years later to work on prettier so that we can have the benefits of a consistent codebase, the less hardcore way...
When I was pitched React for the first time, it sounded to me crazy that re-rendering everything could be fast, but once I was proved it was not the case, I already knew it would work. I already made the mental journey of all thinking about the tree traversals, functional programming paradigms... It didn't sound foreign to me as it did for so many other people.
I've spent a lot of time building lint rules, codemods, extended syntax, pretty printers... because I've had excellent courses on how to parse and execute programming languages and had to build many at school, the challenge was not about learning how it works but more how to actually implement it in the context of JavaScript.
While on the photos team at Facebook, one of the issue was that we cropped photos in a terrible way. Building an algorithm to find the best crop turned out to be manageable because I knew how to read the literature around existing approaches. Once I settled on one, I already wrote a ton of similar algorithms so it was just a matter of writing code down with little challenge.

Conclusion

I could have gotten a well paying programming job right out of high school as a web developer, but going to EPITA made me such a better programmer because I've been exposed in depth to a lot of areas that I would likely have never gone on my own.

Now, if I see a problem on a web developer task, I also know that changing the programming language, using machine learning, writing code in a more C-like way, using other data structures and algorithms... are possible tools to solve that problem. More importantly, if I need to use those, I can implement them because I've already done similar things in the past.

Your experience may vary, but for me, getting a CS degree at EPITA was a really good decision and I would do it again if I had to.

Apr

Climb – Property-based dispatch in functional languages

vjeux EPITA, Lisp Add you comment

ELS Presentation | A Generic and Dynamic Approach to Image Processing | Chaining Operators & Component Trees | Property-based dispatch in functional languages

This is the third (and last) presentation about my work on Climb at the LRDE. During the first one I tackled genericity on data structures, the second was about genericity on values and this one talks about genericity on algorithms.

Climb - Property-based dispatch in functional languages

Abstract: "Climb is a generic image processing library. A generic algorithm interface often requires several different specialized implementations. Olena, a C++ library, solves this using properties.

We present a way to dispatch a function call to the best specialized implementation using properties in a dynamic programming language: Common Lisp. Then, we introduce examples of algorithms and properties used in image processing."

Jan

Javascript Ray Tracer

vjeux EPITA, Javascript, Project Add you comment

Here is a report of the Ray Tracer written by myself Christopher Chedeau. I've taken the file format and most of the examples from the Ray Tracer of our friends Maxime Mouial and Clément Bœsch. The source is available on Github.

It is powered by Open Source technologies: glMatrix, CodeMirror, CoffeeScript, Twitter Bootstrap, jQuery and Web Workers.

Check out the demo, or click on any of the images.

Objects

Our Ray Tracer supports 4 object types: Plane, Sphere, Cylinder and Cone.

The core idea of the Ray Tracer is to send rays that will be reflected on items. Given a ray (origin and direction), we need to know if it intersect an object on the scene, and if it does, how to get a ray' that will be reflected on the object.

Knowing that, we open up our high school math book and come up with all the following formulas.

Legend: Ray Origin \(O\), Ray Direction \(D\), Intersection Position \(O'\), Intersection Normal \(N\) and Item Radius \(r\).

	Intersection	Normal
Plane	\[t = \frac{O_z}{D_z}\]	\[ N = \left\{ \begin{array}{l} x = 0 \\ y = 0 \\ z = -sign(D_z) \end{array} \right. \]
Sphere	\[ \begin{array}{l l l} & t^2 & (O \cdot O) \\ + & 2t & (O \cdot D) \\ + & & (O \cdot O) - r^2 \end{array} = 0\]	\[ N = \left\{ \begin{array}{l} x = O'_x \\ y = O'_y \\ z = O'_z \end{array} \right. \]
Cylinder	\[ \begin{array}{l l l} & t^2 & (D_x D_x + D_y D_y) \\ + & 2t & (O_x D_x + O_y D_y) \\ + & & (O_x O_x + O_y O_y - r^2) \end{array} = 0\]	\[ N = \left\{ \begin{array}{l} x = O'_x \\ y = O'_y \\ z = 0 \end{array} \right. \]
Cone	\[ \begin{array}{l l l} & t^2 & (D_x D_x + D_y D_y - r^2 D_z D_z) \\ + & 2t & (O_x D_x + O_y D_y - r^2 O_z D_z) \\ + & & (O_x O_x + O_y O_y - r^2 O_z O_z) \end{array} = 0\]	\[ N = \left\{ \begin{array}{l} x = O'_x \\ y = O'_y \\ z = - O'_z * tan(r^2) \end{array} \right. \]

In order to solve the equation \(at^2 + bt + c = 0\), we use
\[\Delta = b^2 - 4ac \]\[
\begin{array}{c c c}
\Delta \geq 0 & t_1 = \frac{-b - \sqrt{\Delta}}{2a} & t_2 = \frac{-b + \sqrt{\Delta}}{2a}
\end{array}
\]

And here is the formula for the reflected ray:

\[
\left\{
\begin{array}{l}
O' = O + tD + \varepsilon D' \\
D' = D - 2 (D \cdot N) * N
\end{array}
\right.
\]

In order to fight numerical precision errors, we are going to move the origin of the reflected point a little bit in the direction of the reflected ray (\(\varepsilon D'\)). It will avoid to falsely detect a collision with the current object.

Coordinates, Groups and Rotations

We want to move and rotate objects. In order to do that, we compute a transformation matrix (and it's inverse) for each object in the scene using the following code:

\[
T = \begin{array}{l}
(Identity * Translate_g * RotateX_g * RotateY_g * RotateZ_g) * \\
(Identity * Translate_i * RotateX_i * RotateY_i * RotateZ_i)
\end{array}
\]\[ I = T^{-1} \]

\[Translate(x, y, z) = \left(\begin{array}{c c c c} 1 & 0 & 0 & x \\ 0 & 1 & 0 & y \\ 0 & 0 & 1 & z \\ 0 & 0 & 0 & 1 \end{array}\right)\]	\[RotateX(\alpha) = \left(\begin{array}{c c c c} 1 & 0 & 0 & 0 \\ 0 & cos(\alpha) & -sin(\alpha) & 0 \\ 0 & sin(\alpha) & cos(\alpha) & 0 \\ 0 & 0 & 0 & 1 \end{array}\right)\]
\[RotateY(\alpha) = \left(\begin{array}{c c c c} cos(\alpha) & 0 & sin(\alpha) & 0 \\ 0 & 1 & 0 & 0 \\ -sin(\alpha) & 0 & cos(\alpha) & 0 \\ 0 & 0 & 0 & 1 \end{array}\right)\]	\[RotateZ(\alpha) = \left(\begin{array}{c c c c} cos(\alpha) & -sin(\alpha) & 0 & 0 \\ sin(\alpha) & cos(\alpha) & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{array}\right)\]

We have written the intersection and normal calculations in the object's coordinate system instead of the world's coordinate system. It makes them easier to write. We use the transformation matrix to do object -> world and the inverse matrix to do world -> object.

\[
\left\{\begin{array}{l}
O_{world} = T * O_{object} \\
D_{world} = (T * D_{object}) - (T * 0_4)
\end{array}\right.
\]

\[
\left\{\begin{array}{l}
O_{object} = I * O_{world} \\
D_{object} = (I * D_{world}) - (I * 0_4)
\end{array}\right.
\]

\[0_4 = \left(\begin{array}{c} 0 \\
0 \\
0 \\
1
\end{array}\right)
\]

Bounding Box

The previous equations give us objects with infinite dimensions (except for the sphere) whereas objects in real life have finite dimensions. To simulate this, it is possible to provide two points that will form a bounding box around the object. On the intersection test, we are going to use the nearest point that is inside the bounding box.

This gives us the ability to do various objects such as mirrors, table surface and legs, light bubbles and even a Pokeball!

Light

An object is composed of an Intensity \(I_o\), a Color \(C_o\) and a Brightness \(B_o\). Each light has a Color \(C_l\) and there is an ambient color \(C_a\). Using all those properties, we can calculate the color of a point using the following formula:

\[
I_o * (C_o + B_o) * \left(C_a + \sum_{l}{(N \cdot D) * C_l}\right)
\]

Only the lights visible from the intersection point are used in the sum. In order to check this, we send a shadow ray from the intersection point to the light and see if it intersects any object.

The following images are examples to demonstrate the lights.

Textures

In order to put a texture on an object, we need to map a point \((x, y, z)\) in the object's coordinate system into a point \((x, y)\) in the texture's coordinate system. For planes, it is straightforward, we just the \(z\) coordinate (which is equal to zero anyway). For spheres, cylinders and cones it is a bit more involved. Here is the formula where \(w\) and \(h\) are the width and height of the texture.

\[
\begin{array}{c c}
\phi = acos(\frac{O'_y}{r}) & \theta = \frac{acos\left(\frac{O'_x}{r * sin(\phi)}\right)}{2\pi}
\end{array}
\]\[
\begin{array}{c c}
x = w * \left\{\begin{array}{l l} \theta & \text{if } O'_x < 0 \\ 1 - \theta & \text{else}\end{array}\right. & y = h * \frac{\phi}{\pi} \end{array} \] Once we have the texture coordinates, we can easily create a checkerboard or put a texture. We added options such as scaling and repeat in order to control how the texture is placed.

We also support the alpha mask in order to make a color from a texture transparent.

Progressive Rendering

Ray tracing is a slow technique. At first, I generated pixels line by line, but I found out that the first few lines do not hold much information.

Instead, what we want to do is to have a fast overview of the scene and then improve on the details. In order to do that, during the first iteration we are only generating 1 pixel for a 32x32 square. Then we generate 1 pixel for a 16x16 square and so on ... We generate the top-left pixel and fill all the unknown pixels with it.

In order not to regenerate pixels we already seen, I came up with a condition to know if a pixel has already been generated. \(size\) is the current square size (32, 16, ...).

\[\left\{\begin{array}{l}
x \equiv 0 \pmod{size * 2}\\
y \equiv 0 \pmod{size * 2}
\end{array}\right.
\]

Supersampling

Aliasing is a problem with Ray Tracing and we solve this issue using supersampling. Basically, we send more than one ray for each pixel. We have to chose representative points from a square. There are multiple strategies: in the middle, in a grid or random. Check the result of various combinations in the following image:

1x: Basic
4x: 4 random 1 fixed, 3 random 2x upscale
16x: 4x upscale 2x upscale, 1 fixed, 3 random

Perlin Noise

We can generate random textures using Perlin Noise. We can control several parameters such as \(octaves\), the number of basic noise, the initial scale \(f\) and the factor of contribution \(p\) of the high frequency noises.

\[ noise(x, y, z) = \sum_{i = 0}^{octaves}{p^i * PerlinNoise(\frac{2^i}{f}x, \frac{2^i}{f}y, \frac{2^i}{f}z)} \]

\[noise\]

\[noise * 20 - \lfloor noise * 20 \rfloor\]

\[\frac{cos(noise) + 1}{2}\]

As seen in the example, we can apply additional functions after the noise has been generated to make interesting effects.

Portal

Last but not least, Portals from the self-titled game. They are easy to reproduce in a Ray Tracer and yet, I haven't seen any done.

If a ray enters portal A, it will go out from portal B. It is trivial to implement it, it is just a coordinates system transformation. Like we did for world and object transformation, we do it between A and B using their transformation matrix.

\[
\left\{\begin{array}{l}
O_{a}' = T * O_{b} \\
D_{a}' = (T * D_{b}) - (T * 0_4)
\end{array}\right.
\]

\[
\left\{\begin{array}{l}
O_{b}' = T * O_{a} \\
D_{b}' = (T * D_{a}) - (T * 0_4)
\end{array}\right.
\]

Scene Editor

In order to create scenes more easily, we have defined a scene description language. We developed a basic CodeMirror syntax highlighting script. Just enter write your scene down and press Ray Trace 🙂

Nov

Simulated Annealing Project

vjeux EPITA, Javascript, Project Add you comment

For a school project, I have to implement Simulated Annealing meta heuristic.

Thanks to many open source web tools, I've been able to quickly do the project and have a pretty display. CoffeeScript, Raphael, Highcharts, Three.js, Twitter Bootstrap, jQuery and Web Workers.

2D Demo

You have a shuffled grid and you have to find the original grid. The only operation available is to swap 2 random elements. The cost function is the distance of the point with their original neighbors.

Click to view the 2D demo

3D Demo

On this one, we are given many functions and we have to find the global minimum. The challenge was to be able to display the evolution of the algorithm, as it traverses 200k points per second.

Click to view the 3D demo

CoffeeScript Sexyness

I've written the project in CoffeeScript and I don't regret it. I find myself writing code a lot faster because I have a lot less to type. Most of the Javascript syntax is either shortened (function to ->) or optional (parenthesis, curly brackets ...) in CoffeeScript. There are also handy features such as splats, generators ...

Destructuring Assignment

worker.onmessage = (data: [id, rest...]) ->
  switch id
    when 'update'
      [cost, temperature, accepted, tried, data, force] = rest

Trimmed JSON

chart = new Highcharts.Chart
  chart:
    renderTo: 'container'
  yAxis: [
    title:
      text: 'Cost'
  ]
  tooltip:
    formatter: ->
      this.series.name + ': ' + this.y

Report

Felix Abecassis wrote a report that explains everything 🙂 It's in French, sorry!

Download PDF

Oct

JSPP – Morph C++ Into Javascript (Paper)

vjeux EPITA, Javascript Add you comment

6 months ago, I wrote the blog post "JSPP - Morph C++ into Javascript". My supervisor at the LRDE (R&D Lab of EPITA), Didier Verna, found it interesting and told me that it could be worth a publication. With his great help, I've written my first article. We have submitted it to two conferences (one after the other) and received two refusals.

Dynamic Language Symposium (DLS) 2012.
OOPS 2012: 26th ACM Symposium on Applied Computing. Special track on Object Oriented Programming Languages and Systems.

Overall, the reviewers said the article was well written and the implementation interesting but since it has no practical use, it does not deserve a publication.

Even if it didn't make it, I'm really proud of the article. If you are interested in the implementation of JSPP, it can be an enlightening read 🙂